Polyaxon lets you schedule TensorFlow experiments and distributed TensorFlow experiments, and supports tracking metrics, outputs, and models.
With Polyaxon you can:
- log hyperparameters for every run
- see learning curves for losses and metrics during training
- see hardware consumption and stdout/stderr output during training
- log images, charts, and other assets
- log git commit information
- log environment information
- log models
Polyaxon provides a tracking API to track experiments and report metrics, artifacts, logs, and results to the Polyaxon dashboard.
You can use the tracking API to create a custom tracking experience with TensorFlow.
To use Polyaxon tracking with TensorFlow, install the Polyaxon client:
pip install polyaxon
Initializing a run explicitly is optional. Do it if you need to perform some manual tracking or to record some information before passing the callback.
from polyaxon import tracking

tracking.init(...)
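One kind of manual tracking you might do at this point is logging hyperparameters. The sketch below is only an illustration: `flatten_params` is a hypothetical helper (not part of Polyaxon) that turns a nested configuration into flat key/value pairs, and the commented-out call assumes `tracking.log_inputs` accepts arbitrary keyword arguments.

```python
from typing import Any, Dict


def flatten_params(params: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested dict into a single level, joining keys with underscores."""
    flat: Dict[str, Any] = {}
    for key, value in params.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as {"optimizer": {...}}.
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical hyperparameters for illustration only.
config = {"lr": 0.01, "optimizer": {"name": "adam", "beta1": 0.9}, "batch_size": 32}
flat = flatten_params(config)

# Inside a Polyaxon run (assumption about the API, not executed here):
#   from polyaxon import tracking
#   tracking.init()
#   tracking.log_inputs(**flat)
```

Flat, underscore-joined keys are a design choice here so that every entry can be passed as a plain keyword argument.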
Polyaxon provides a TensorFlow callback that you can use with your experiment to report metrics automatically:
from polyaxon.tracking.contrib.tensorflow import PolyaxonCallback
...
estimator.train(hooks=[PolyaxonCallback(...)])
...
Polyaxon's callback can be customized to alter the default behavior:
- It will use the current initialized run unless you pass a different run
- You can enable image logging
- You can enable histogram logging
- You can enable tensor logging
PolyaxonCallback(run=run, log_image=True, log_histo=True, log_tensor=True)
For more control, you can use Polyaxon to log metrics directly from your custom TensorFlow training loops:
import tensorflow as tf
from polyaxon import tracking

with tf.GradientTape() as tape:
    # Get the probabilities
    predictions = model(features)
    # Calculate the loss
    loss = loss_func(labels, predictions)

# Log your metrics (convert the tensor to a plain NumPy value first)
tracking.log_metrics(loss=loss.numpy())
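To show the shape of a full loop with per-step logging, here is a minimal sketch that runs without TensorFlow or Polyaxon installed. Plain Python gradient descent on a toy quadratic stands in for the `tf.GradientTape` step, and `record` is a hypothetical stand-in for the logger. It assumes `tracking.log_metrics` accepts a `step` keyword alongside the metric keywords, so in a real run you could pass `tracking.log_metrics` as `log_fn` after calling `tracking.init()`.

```python
from typing import Any, Callable, Dict, List


def train(steps: int, lr: float, log_fn: Callable[..., None]) -> float:
    """Minimize f(w) = (w - 3)^2 with gradient descent, logging the loss at every step."""
    w = 0.0
    for step in range(steps):
        loss = (w - 3.0) ** 2      # stands in for loss_func(labels, predictions)
        grad = 2.0 * (w - 3.0)     # stands in for tape.gradient(...)
        w -= lr * grad
        # Log a plain Python float together with the step index.
        log_fn(step=step, loss=float(loss))
    return w


# Hypothetical stand-in logger that records what would be reported.
history: List[Dict[str, Any]] = []


def record(**metrics: Any) -> None:
    history.append(metrics)


final_w = train(steps=50, lr=0.1, log_fn=record)
```

Passing the logger in as an argument keeps the loop testable offline; only the call site changes when the run is tracked.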
To make sure the model is uploaded to your artifacts store, use the path returned by get_outputs_path("model_rel_path", is_dir=True) as your checkpoint directory:
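from polyaxon import tracking
from polyaxon.tracking.contrib.tensorflow import PolyaxonCallback
...
tracking.init()
...
# Resolve a model directory under the run's outputs path
model_dir = tracking.get_outputs_path("model", is_dir=True)
classifier = tf.estimator.LinearClassifier(
    model_dir=model_dir,
    feature_columns=[...],
    n_classes=2
)
# Register the model directory as a model reference for this run
tracking.log_model_ref(model_dir, framework="tensorflow", ...)
...
classifier.train(input_fn=train_input_fn, steps=100000, hooks=[PolyaxonCallback()])
...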