Polyaxon lets you schedule PyTorch experiments and distributed PyTorch experiments, and supports tracking metrics, outputs, and models.
With Polyaxon you can:
- log hyperparameters for every run
- see learning curves for losses and metrics during training
- see hardware consumption and stdout/stderr output during training
- log images, charts, and other assets
- log git commit information
- log environment information
- log models
- …
Tracking API
Polyaxon provides a tracking API to track experiments and report metrics, artifacts, logs, and results to the Polyaxon dashboard.
You can use the tracking API to create a custom tracking experience with PyTorch.
Setup
To use Polyaxon tracking with PyTorch, you need to install the Polyaxon library:
pip install polyaxon
Initialize your script with Polyaxon
This step is needed only if you perform manual tracking, or want to track some information before training starts.
from polyaxon import tracking
tracking.init(...)
Manual logging
If you want more control, you can use Polyaxon to log metrics in your custom PyTorch training loops:
- log metrics
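import torch.nn.functional as F

for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()  # clear gradients left over from the previous step
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    # report the loss as a plain Python float rather than a tensor
    tracking.log_metrics(loss=loss.item())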
- log the model
asset_path = tracking.get_outputs_path('model.ckpt')
torch.save(model.state_dict(), asset_path)
# log a reference to the saved model
tracking.log_model_ref(asset_path, framework="pytorch", ...)
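Logging a metric on every batch can produce noisy curves and many tracking calls. The following is a minimal sketch, not part of the Polyaxon API, of one way to average the loss over a window of steps before reporting it. The helper name `log_every_n_steps`, its `every` parameter, and the `record` stand-in are hypothetical and exist only for this illustration. The reporting function is passed in as `log_fn`; inside a Polyaxon run you would pass `tracking.log_metrics`, on the assumption that it accepts a `step` keyword alongside the metric values.

```python
from typing import Callable, Iterable


def log_every_n_steps(
    losses: Iterable[float],
    log_fn: Callable[..., None],
    every: int = 10,
) -> None:
    """Average scalar losses over windows of `every` steps and report each average."""
    window_sum = 0.0
    window_len = 0
    last_step = -1
    for step, loss in enumerate(losses):
        window_sum += float(loss)
        window_len += 1
        last_step = step
        if window_len == every:
            # Report the window average at the step that closes the window.
            log_fn(step=step, loss=window_sum / window_len)
            window_sum, window_len = 0.0, 0
    # Flush a partial final window so the last steps are not lost.
    if window_len > 0:
        log_fn(step=last_step, loss=window_sum / window_len)


# Stand-in for the real logger (hypothetical): collects what would be reported.
records = []


def record(**metrics):
    records.append(metrics)


log_every_n_steps([4.0, 2.0, 3.0, 1.0, 5.0], record, every=2)
print(records)
```

In a training loop, the values fed to the helper would be the per-batch `loss.item()` results. Injecting the logger as a callable keeps the averaging logic testable outside a Polyaxon run.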