Polyaxon lets you schedule PyTorch experiments and distributed PyTorch experiments, and supports tracking metrics, outputs, and models.

With Polyaxon you can:

  • log hyperparameters for every run
  • see learning curves for losses and metrics during training
  • see hardware consumption and stdout/stderr output during training
  • log images, charts, and other assets
  • log git commit information
  • log environment information
  • log models
  • ...

Tracking API

Polyaxon provides a tracking API to track experiments and report metrics, artifacts, logs, and results to the Polyaxon dashboard.

You can use the tracking API to create a custom tracking experience with PyTorch.

Setup

To use Polyaxon tracking with PyTorch, install the Polyaxon client:

pip install polyaxon

Initialize your script with Polyaxon

This is an optional step if you need to perform some manual tracking or to track some information before passing the callback.

from polyaxon import tracking

tracking.init(...)

Manual logging

If you want more control, you can use Polyaxon to log metrics in your custom PyTorch training loops:

  • log metrics
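import torch.nn.functional as F

for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()  # clear gradients left over from the previous batch
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    # report a plain Python float rather than a tensor
    tracking.log_metrics(loss=loss.item())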
  • log the model
asset_path = tracking.get_outputs_path('model.ckpt')
torch.save(model.state_dict(), asset_path)

# log model
tracking.log_artifact_ref(asset_path, framework="pytorch", ...)
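
Logging on every batch can produce a very dense curve. One option is to average the loss over a few batches before reporting it. The following is a minimal sketch of that idea and is not part of the Polyaxon API: make_throttled_logger and its every parameter are hypothetical names introduced here, and log_fn is assumed to be any callable that accepts keyword metrics, as tracking.log_metrics(loss=...) does above. A stand-in callable is used so the sketch runs without Polyaxon installed.

```python
from typing import Callable, List


def make_throttled_logger(
    log_fn: Callable[..., None], every: int = 10
) -> Callable[[float], None]:
    """Return a function that buffers losses and reports their mean
    through `log_fn` once every `every` calls.

    Hypothetical helper: `log_fn` stands in for a call such as
    `tracking.log_metrics` and is assumed to accept keyword metrics.
    """
    if every < 1:
        raise ValueError("every must be >= 1")
    buffer: List[float] = []

    def record(loss: float) -> None:
        buffer.append(float(loss))
        if len(buffer) == every:
            # Report the mean of the buffered losses, then start over.
            log_fn(loss=sum(buffer) / len(buffer))
            buffer.clear()

    return record


if __name__ == "__main__":
    # Stand-in for tracking.log_metrics: collect what would be reported.
    reported = []
    record = make_throttled_logger(lambda **m: reported.append(m), every=2)
    for loss in [4.0, 2.0, 1.0, 3.0, 5.0]:
        record(loss)
    print(reported)  # [{'loss': 3.0}, {'loss': 2.0}]
```

In a real training loop you would pass tracking.log_metrics as log_fn and call record(loss.item()) after each optimizer step. Losses left in the buffer when training ends are not reported by this sketch.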