Polyaxon lets you schedule PyTorch experiments and distributed PyTorch experiments, and supports tracking metrics, outputs, and models.
With Polyaxon you can:
- log hyperparameters for every run
- see learning curves for losses and metrics during training
- see hardware consumption and stdout/stderr output during training
- log images, charts, and other assets
- log git commit information
- log environment information
- log models
Polyaxon provides a tracking API to track experiments and report metrics, artifacts, logs, and results to the Polyaxon dashboard.
You can use the tracking API to create a custom tracking experience with PyTorch.
To use Polyaxon tracking with PyTorch, install the Polyaxon client:
pip install polyaxon
This step is optional. Use it if you need to perform manual tracking or record information before training starts.
from polyaxon import tracking

tracking.init(...)
If you want more control, use Polyaxon to log metrics in your custom PyTorch training loops:
- log metrics
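import torch.nn.functional as F

for batch_idx, (data, target) in enumerate(train_loader):
    # Clear gradients left over from the previous batch
    optimizer.zero_grad()
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    # Log the loss as a plain Python float
    tracking.log_metrics(loss=loss.item())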
- log the model
asset_path = tracking.get_outputs_path('model.ckpt')
torch.save(model.state_dict(), asset_path)
# log model
tracking.log_artifact_ref(asset_path, framework="pytorch", ...)
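Logging a metric on every batch, as in the metrics loop above, can produce a very dense learning curve on long runs. The following is a minimal sketch of one way to throttle it, not part of the Polyaxon API: the helpers `to_scalar` and `log_every` and the stand-in `fake_log` are hypothetical names introduced here for illustration. The sketch assumes the logging function accepts a `step` keyword plus metric keywords, which is how `tracking.log_metrics` is expected to be called.

```python
from typing import Any, Callable, Dict


def to_scalar(value: Any) -> float:
    """Convert a tensor-like value (anything with .item()) or a number to a float."""
    item = getattr(value, "item", None)
    return float(item()) if callable(item) else float(value)


def log_every(log_fn: Callable[..., Any], step: int, every: int, **metrics: Any) -> bool:
    """Call log_fn(step=..., **metrics) on every `every`-th step.

    Returns True when the metrics were logged, False when the step was skipped.
    """
    if every < 1:
        raise ValueError("every must be >= 1")
    if step % every != 0:
        return False
    scalars: Dict[str, float] = {name: to_scalar(v) for name, v in metrics.items()}
    log_fn(step=step, **scalars)
    return True


# Hypothetical stand-in logger so the sketch runs without a Polyaxon context.
# In a tracked script, pass tracking.log_metrics instead of fake_log.
recorded = []


def fake_log(step=None, **metrics):
    recorded.append((step, metrics))


# Simulated per-batch losses; in a real loop these would be loss tensors.
for batch_idx, loss in enumerate([0.9, 0.7, 0.5, 0.4, 0.3]):
    log_every(fake_log, step=batch_idx, every=2, loss=loss)
```

In a training script, the call inside the loop would become something like `log_every(tracking.log_metrics, step=batch_idx, every=50, loss=loss)`, assuming `tracking.init(...)` has already been called. Passing the logging function as an argument keeps the helper easy to test outside a Polyaxon run.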