You can use one or more blobs on Azure Storage to access data directly in your machine learning experiments and jobs.
Create an Azure Storage account
Create a storage account (e.g. plxstorage; account names may only contain lowercase letters and digits) and a blob container (e.g. data).
You need to expose the information required to connect to the blob storage. The standard way is to expose these keys:
AZURE_ACCOUNT_NAME
AZURE_ACCOUNT_KEY
AZURE_CONNECTION_STRING
Create a secret or a config map for storing these keys
We recommend using a secret to store your access information:
kubectl create secret -n polyaxon generic az-secret --from-literal=AZURE_ACCOUNT_NAME=account --from-literal=AZURE_ACCOUNT_KEY=hash-key
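If you prefer to keep the secret as a manifest instead of running the command by hand, the following is a minimal sketch of an equivalent definition built in Python. The helper name `build_secret_manifest` is hypothetical, and the values are the same placeholders used in the command above. The only Kubernetes behavior it relies on is that entries under `data` must be base64-encoded.

```python
import base64
import json


def build_secret_manifest(name, namespace, values):
    """Return a Kubernetes Secret manifest (as a dict) for the given key/value pairs."""
    # Kubernetes expects every entry under `data` to be base64-encoded.
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


manifest = build_secret_manifest(
    name="az-secret",
    namespace="polyaxon",
    values={
        "AZURE_ACCOUNT_NAME": "account",  # placeholder value
        "AZURE_ACCOUNT_KEY": "hash-key",  # placeholder value
    },
)
# The JSON output can be saved to a file and applied with `kubectl apply -f`.
print(json.dumps(manifest, indent=2))
```

Avoid committing the generated file to version control, since base64 is an encoding and not encryption.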
Use the secret to add a connection
connections:
  - name: azure-dataset1
    kind: wasb
    schema:
      bucket: "wasbs://[email protected]/"
    secret:
      name: "az-secret"
If you want to access multiple datasets using the same secret:
connections:
  - name: azure-dataset1
    kind: wasb
    schema:
      bucket: "wasbs://[email protected]/"
    secret:
      name: "az-secret"
  - name: azure-dataset2
    kind: wasb
    schema:
      bucket: "wasbs://[email protected]/"
    secret:
      name: "az-secret"
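WASB URLs usually take the form `wasbs://<container>@<account>.blob.core.windows.net/<path>`. As a quick sanity check before adding a bucket value to the configuration, here is a minimal sketch that splits such a URL into its parts. The function `parse_wasbs_url`, the container name `data`, and the account name `plxstorage` are illustrative assumptions, not part of Polyaxon.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class WasbLocation(NamedTuple):
    container: str
    account: str
    path: str


def parse_wasbs_url(url: str) -> WasbLocation:
    """Split wasb(s)://<container>@<account>.blob.core.windows.net/<path>."""
    parsed = urlparse(url)
    if parsed.scheme not in ("wasb", "wasbs"):
        raise ValueError(f"Expected a wasb:// or wasbs:// URL, got: {url!r}")
    if not parsed.username or not parsed.hostname:
        raise ValueError(f"Missing container or account host in: {url!r}")
    # The storage account is the first label of the host name.
    account = parsed.hostname.split(".", 1)[0]
    return WasbLocation(
        container=parsed.username,
        account=account,
        path=parsed.path.lstrip("/"),
    )


# Hypothetical names: container "data" in storage account "plxstorage".
location = parse_wasbs_url("wasbs://data@plxstorage.blob.core.windows.net/train/images")
print(location)
```

A check like this catches a swapped container and account, or a missing `@`, before the connection is deployed.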
Update/Install Polyaxon CE or Polyaxon Agent deployment
You can deploy/upgrade your Polyaxon CE or Polyaxon Agent deployment with access to data on Azure.
Access to the dataset in your experiments/jobs
To expose the connection secret to one of the containers in your jobs or services:
run:
  kind: job
  connections: [azure-dataset1]
Or, to request several connections at once:
run:
  kind: job
  connections: [azure-dataset1, s3-dataset1]
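Inside the container, your code can then pick up the credentials. The sketch below assumes the secret's keys are surfaced as environment variables with the names listed earlier; the helper `resolve_azure_credentials` is hypothetical, and preferring the connection string over the account name/key pair is a design choice made for this example, not documented Polyaxon behavior.

```python
import os
from typing import Mapping, Optional


def resolve_azure_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick a connection string, or an account name/key pair, from the environment."""
    env = os.environ if env is None else env
    connection_string = env.get("AZURE_CONNECTION_STRING")
    if connection_string:
        return {"connection_string": connection_string}
    account_name = env.get("AZURE_ACCOUNT_NAME")
    account_key = env.get("AZURE_ACCOUNT_KEY")
    if account_name and account_key:
        return {"account_name": account_name, "account_key": account_key}
    raise RuntimeError(
        "Set AZURE_CONNECTION_STRING, or both AZURE_ACCOUNT_NAME and AZURE_ACCOUNT_KEY."
    )


# Example with an explicit mapping instead of the real environment.
credentials = resolve_azure_credentials(
    {"AZURE_ACCOUNT_NAME": "account", "AZURE_ACCOUNT_KEY": "hash-key"}
)
print(sorted(credentials))
```

Failing early with a clear message makes a missing or misnamed secret key easier to diagnose than an authentication error raised later by the storage client.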
Use the initializer to load the dataset
To load the dataset with the artifacts initializer:
run:
  kind: job
  init:
    - artifacts: {dirs: [...], files: [...]}
      connection: "azure-dataset1"
Use Polyaxon to access the dataset
This is optional; you can use any language or logic to interact with Azure Storage.
Polyaxon has some built-in logic that you can leverage if you want.
To use that logic, install the Azure extra:
pip install polyaxon[azure]
The store service exposes the following methods:
from polyaxon.connections.azure.azure_blobstore import AzureBlobStoreService
store = AzureBlobStoreService(...)
store.delete()
store.ls()
store.upload_file()
store.upload_dir()
store.download_file()
store.download_dir()