init

tracking.init(owner=None, project=None, run_uuid=None, client=None, track_code=True, track_env=True, refresh_data=False, artifacts_path=None, collect_artifacts=None, collect_resources=None, is_offline=None, is_new=None, name=None, description=None, tags=None)

The tracking module offers the same functionality as the tracking client, without the need to create a run instance.

The tracking module allows you to call all tracking methods directly from the top level module.

This is especially convenient when you are running in-cluster experiments:

from polyaxon import tracking

tracking.init()
...
tracking.log_metrics(step=1, loss=0.09, accuracy=0.75)
...
tracking.log_metrics(step=2, loss=0.02, accuracy=0.85)
...

A global TRACKING_RUN will be set on the module.

  • Args:

    • owner: str, optional, the owner is the username or the organization name owning this project.
    • project: str, optional, project name owning the run(s).
    • run_uuid: str, optional, run uuid.
    • client: PolyaxonClient, optional, an instance of a configured client, if not passed, a new instance will be created based on the available environment.
    • track_code: bool, optional, default True, to track code version. Polyaxon will try to track information about any repo configured in the context where this client is instantiated.
    • track_env: bool, optional, default True, to track information about the environment.
    • refresh_data: bool, optional, default False, to refresh the run data at instantiation.
    • refresh_data: bool, optional, default False, to instruct the run to resume, only useful when the run is not managed by Polyaxon.
    • artifacts_path: str, optional, for in-cluster runs it will be set automatically.
    • collect_artifacts: bool, optional, similar to the env var flag POLYAXON_COLLECT_ARTIFACTS, this env var is True by default for managed runs and is controlled by the plugins section.
    • collect_resources: bool, optional, similar to the env var flag POLYAXON_COLLECT_RESOURCES, this env var is True by default for managed runs and is controlled by the plugins section.
    • is_offline: bool, optional, to trigger the offline mode manually instead of depending on POLYAXON_IS_OFFLINE.
    • is_new: bool, optional, to force the creation of a new run instead of trying to discover a cached run or refreshing an instance from the env var.
    • name: str, optional, When is_new or is_offline is set to true, a new instance is created and you can initialize that new run with a name.
    • description: str, optional, When is_new or is_offline is set to true, a new instance is created and you can initialize that new run with a description.
    • tags: str or List[str], optional, When is_new or is_offline is set to true, a new instance is created and you can initialize that new run with tags.
  • Raises:

    • PolyaxonClientException: If no owner and/or project are passed and Polyaxon cannot resolve the values from the environment.

get_or_create_run

get_or_create_run(tracking_run=None)

Get or create a new tracking run.

It tries to create a new instance; for in-cluster runs, this works automatically.

This function is used inside some Polyaxon callbacks; in your own code you should use init instead.


get_tensorboard_path

tracking.get_tensorboard_path(rel_path='tensorboard', use_store_path=False)

Returns a tensorboard path for this run relative to the outputs path.

If use_store_path is enabled, the path returned will be relative to the artifacts store path and not Polyaxon's context. Note that the library does not ensure that the path exists when this flag is set to True.

  • Args:
    • rel_path: str, optional, default "tensorboard", the relative path to the outputs context.
    • use_store_path: bool, default False.
  • Returns: str, outputs_path/rel_path

get_outputs_path

tracking.get_outputs_path(rel_path=None, ensure_path=True, is_dir=False, use_store_path=False)

Get the absolute outputs path of the specified artifact in the currently active run.

If rel_path is specified, the path is resolved under the outputs root of the currently active run: root_run_artifacts_path/outputs/rel_path. If rel_path is not specified, the outputs root configured for this instance is returned: root_run_artifacts_path/outputs.

If ensure_path is True, the path is created. By default, only the path up to the last part of the rel_path argument is created; if is_dir is True, the complete rel_path is created.

If use_store_path is enabled, the path returned will be relative to the artifacts store path and not Polyaxon's context. Note that the library does not ensure that the path exists when this flag is set to True.

  • Args:
    • rel_path: str, optional.
    • ensure_path: bool, optional, default True.
    • is_dir: bool, optional, default False.
    • use_store_path: bool, default False.
  • Returns: str, outputs_path
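
The ensure_path and is_dir rules described above can be modelled with a few lines of standard-library code. This is only an illustrative sketch and not Polyaxon's implementation: resolve_outputs_path and its root argument are hypothetical names, and creating the outputs root when rel_path is omitted is an assumption.

```python
from pathlib import Path


def resolve_outputs_path(root, rel_path=None, ensure_path=True, is_dir=False):
    """Hypothetical model of the documented rules; not Polyaxon's code."""
    path = Path(root) / "outputs"
    if rel_path:
        path = path / rel_path
    if ensure_path:
        # For a file-like rel_path only the parent directories are created;
        # with is_dir=True (or no rel_path, an assumption) the full path is.
        target = path if (is_dir or not rel_path) else path.parent
        target.mkdir(parents=True, exist_ok=True)
    return str(path)
```

With this model, resolve_outputs_path(root, "model/weights.h5") creates root/outputs/model but not the file itself, while passing is_dir=True creates every component of rel_path.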

get_artifacts_path

tracking.get_artifacts_path(rel_path=None, ensure_path=False, is_dir=False, use_store_path=False)

Get the absolute path of the specified artifact in the currently active run.

If rel_path is specified, the path is resolved under the artifacts root of the currently active run: root_run_artifacts_path/rel_path. If rel_path is not specified, the root artifacts path configured for this instance is returned: root_run_artifacts_path.

If ensure_path is True, the path is created. By default, only the path up to the last part of the rel_path argument is created; if is_dir is True, the complete rel_path is created.

If use_store_path is enabled, the path returned will be relative to the artifacts store path and not Polyaxon's context. Note that the library does not ensure that the path exists when this flag is set to True.

  • Args:
    • rel_path: str, optional.
    • ensure_path: bool, optional, default False.
    • is_dir: bool, optional, default False.
    • use_store_path: bool, default False.
  • Returns: str, artifacts_path

log_metric

tracking.log_metric(name, value, step=None, timestamp=None)

Logs a metric datapoint.

log_metric(name="loss", value=0.01, step=10)

It is important to log step along with your metrics if you want to compare experiments on the dashboard using steps, rather than timestamps, on the x-axis.

  • Args:
    • name: str, metric name
    • value: float, metric value
    • step: int, optional
    • timestamp: datetime, optional

log_metrics

tracking.log_metrics(step=None, timestamp=None, **metrics)

Logs multiple metrics.

log_metrics(step=123, loss=0.023, accuracy=0.91)

It is important to log step along with your metrics if you want to compare experiments on the dashboard using steps, rather than timestamps, on the x-axis.

  • Args:
    • step: int, optional
    • timestamp: datetime, optional
    • **metrics: **kwargs, key=value
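
As a minimal sketch of logging a step with every datapoint, the helper below walks a list of per-epoch metric dictionaries and forwards each one as keyword arguments. The helper name log_history and its history argument are hypothetical; the tracking argument stands for the polyaxon.tracking module after tracking.init() has been called, and is passed in explicitly only so the function can be exercised without a cluster.

```python
def log_history(tracking, history, start_step=1):
    """Log one datapoint per entry, using the entry's position as the step.

    `tracking` is assumed to expose log_metrics(step=..., **metrics),
    as the polyaxon.tracking module does.
    """
    last_step = None
    for step, metrics in enumerate(history, start=start_step):
        # Each dict such as {"loss": 0.09, "accuracy": 0.75} becomes kwargs.
        tracking.log_metrics(step=step, **metrics)
        last_step = step
    return last_step
```

With Polyaxon installed, this would be called as log_history(tracking, [{"loss": 0.09}, {"loss": 0.02}]) after from polyaxon import tracking and tracking.init().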

log_image

tracking.log_image(data, name=None, step=None, timestamp=None, rescale=1, dataformats='CHW')

Logs an image.

log_image(data="path/to/image.png", step=10)
log_image(data=np_array, name="generated_image", step=10)
  • Args:
    • data: str or numpy.array, a file path or numpy array
    • name: str, name of the image, if a path is passed this can be optional and the name of the file will be used
    • step: int, optional
    • timestamp: datetime, optional
    • rescale: int, optional
    • dataformats: str, optional

log_image_with_boxes

tracking.log_image_with_boxes(tensor_image, tensor_boxes, name=None, step=None, timestamp=None, rescale=1, dataformats='CHW')

Logs an image with bounding boxes.

log_image_with_boxes(
    name="my_image",
    tensor_image=np.arange(np.prod((3, 32, 32)), dtype=float).reshape((3, 32, 32)),
    tensor_boxes=np.array([[10, 10, 40, 40]]),
)
  • Args:
    • tensor_image: numpy.array or str: Image data or file name
    • tensor_boxes: numpy.array or str: Box data (for detected objects) box should be represented as [x1, y1, x2, y2]
    • name: str, name of the image
    • step: int, optional
    • timestamp: datetime, optional
    • rescale: int, optional
    • dataformats: str, optional

log_mpl_image

tracking.log_mpl_image(data, name=None, close=True, step=None, timestamp=None)

Logs a matplotlib image.

log_mpl_image(name="figure", data=figure, step=1, close=False)
  • Args:
    • data: matplotlib.pyplot.figure or List[matplotlib.pyplot.figure]
    • name: str, optional, name
    • close: bool, optional, default True
    • step: int, optional
    • timestamp: datetime, optional

log_video

tracking.log_video(data, name=None, fps=4, step=None, timestamp=None, content_type=None)

Logs a video.

log_video("path/to/my_video1")
log_video(
    name="my_video2",
    data=np.arange(np.prod((4, 3, 1, 8, 8)), dtype=float).reshape((4, 3, 1, 8, 8))
)
  • Args:
    • data: video data or str.
    • name: str, optional, if data is a filepath the name will be the name of the file
    • fps: int, optional, frames per second
    • step: int, optional
    • timestamp: datetime, optional
    • content_type: str, optional, default "gif"

log_audio

tracking.log_audio(data, name=None, sample_rate=44100, step=None, timestamp=None, content_type=None)

Logs an audio.

log_audio("path/to/my_audio1")
log_audio(name="my_audio2", data=np.arange(np.prod((42,)), dtype=float).reshape((42,)))
  • Args:
    • data: str or audio data
    • name: str, optional, if data is a filepath the name will be the name of the file
    • sample_rate: int, optional, sample rate in Hz
    • step: int, optional
    • timestamp: datetime, optional
    • content_type: str, optional, default "wav"

log_text

tracking.log_text(name, text, step=None, timestamp=None)

Logs a text.

log_text(name="text", text="value")
  • Args:
    • name: str, name
    • text: str, text value
    • step: int, optional
    • timestamp: datetime, optional

log_html

tracking.log_html(name, html, step=None, timestamp=None)

Logs an html.

log_html(name="text", html="<p>value</p>")
  • Args:
    • name: str, name
    • html: str, html value
    • step: int, optional
    • timestamp: datetime, optional

log_np_histogram

tracking.log_np_histogram(name, values, counts, step=None, timestamp=None)

Logs a numpy histogram.

values, counts = np.histogram(np.random.randint(255, size=(1000,)))
log_np_histogram(name="histo1", values=values, counts=counts, step=1)
  • Args:
    • name: str, name
    • values: np.array
    • counts: np.array
    • step: int, optional
    • timestamp: datetime, optional

log_histogram

tracking.log_histogram(name, values, bins, max_bins=None, step=None, timestamp=None)

Logs a histogram.

log_histogram(
    name="histo",
    values=np.arange(np.prod((1024,)), dtype=float).reshape((1024,)),
    bins="auto",
    step=1
)
  • Args:
    • name: str, name
    • values: np.array
    • bins: int or str
    • max_bins: int, optional
    • step: int, optional
    • timestamp: datetime, optional

log_model

tracking.log_model(path, name=None, framework=None, summary=None, step=None, timestamp=None, rel_path='model', versioned=True)

Logs a model, or a versioned model if versioned is True or a step value is provided.

This method will:

  • save the model
  • save several versions of the model and create an event file if the step is provided.

Note 1: This method does a couple of things:

  • It moves the model under the outputs or the assets directory if the step is provided
  • If the step is provided it creates an event file
  • It creates a lineage reference to the model or to the event file if the step is provided

Note 2: If you need more control over where the model is saved and only want to record lineage information for that path, you can use log_model_ref.

  • Args:
    • path: str, path to the model to log
    • name: str, name
    • framework: str, optional, name of the framework
    • summary: Dict, optional, key, value information about the model
    • step: int, optional
    • timestamp: datetime, optional
    • rel_path: str, relative path where to store the model
    • versioned: bool, to enable the versioned behavior for storing the model

log_dataframe

tracking.log_dataframe(df, name, content_type='csv', step=None, timestamp=None)

Logs a dataframe.

  • Args:
    • df: the dataframe to save
    • name: str, optional, if not provided the name of the file will be used
    • content_type: str, optional, csv or html.
    • step: int, optional
    • timestamp: datetime, optional

log_artifact

tracking.log_artifact(path, name=None, kind=None, summary=None, step=None, timestamp=None, rel_path=None, versioned=True)

Logs a generic artifact, or a versioned generic artifact if versioned is True or a step value is provided.

This method will:

  • save the artifact
  • save several versions of the artifact and create an event file if the step is provided.

Note 1: This method does a couple of things:

  • It moves the artifact under the outputs or the assets directory if the step is provided
  • If the step is provided it creates an event file
  • It creates a lineage reference to the artifact or to the event file if the step is provided

Note 2: If you need more control over where the artifact is saved and only want to record lineage information for that path, you can use log_artifact_ref.

  • Args:
    • path: str, path to the artifact
    • name: str, optional, if not provided the name of the file will be used
    • kind: optional, str
    • summary: Dict, optional, additional summary information to log about data in the lineage table.
    • step: int, optional
    • timestamp: datetime, optional
    • rel_path: str, relative path where to store the artifacts
    • versioned: bool, to enable the versioned behavior for storing the artifact
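
A minimal sketch of producing a file and handing it to log_artifact is shown below. The helper save_and_log_table, the directory argument, and the "results" name are hypothetical; the tracking argument stands for the initialized polyaxon.tracking module. Only keyword arguments that appear in the signature above are used.

```python
import csv
import os


def save_and_log_table(tracking, directory, rows, filename="results.csv"):
    """Write rows to a CSV file, then log that file as a generic artifact."""
    path = os.path.join(directory, filename)
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    # Per Note 1 above, log_artifact may move the file, so do not rely on
    # `path` still existing after this call in a real run.
    tracking.log_artifact(path=path, name="results", versioned=False)
    return path
```

Passing a step instead of versioned=False would, according to the description above, also create an event file for the artifact.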

log_roc_auc_curve

tracking.log_roc_auc_curve(name, fpr, tpr, auc=None, step=None, timestamp=None)

Logs a ROC/AUC curve. This method expects already processed values.

log_roc_auc_curve("roc_value", fpr, tpr, auc=0.6, step=1)
  • Args:
    • name: str, name of the curve
    • fpr: List[float] or numpy.array, false positive rate
    • tpr: List[float] or numpy.array, true positive rate
    • auc: float, optional, calculated area under curve
    • step: int, optional
    • timestamp: datetime, optional
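
Because log_roc_auc_curve expects already processed values, fpr and tpr have to be computed beforehand. The sketch below does this in plain Python by sweeping a threshold over the sorted scores; roc_points, trapezoid_auc, and log_roc are hypothetical helper names, binary labels are assumed to be 0 or 1, and the tracking argument stands for the initialized polyaxon.tracking module.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists for binary labels, one point per distinct score."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume every sample that shares this score so ties form one point.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the curve using the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


def log_roc(tracking, name, y_true, y_score, step=None):
    fpr, tpr = roc_points(y_true, y_score)
    auc = trapezoid_auc(fpr, tpr)
    tracking.log_roc_auc_curve(name, fpr, tpr, auc=auc, step=step)
    return auc
```

If scikit-learn is available, log_sklearn_roc_auc_curve avoids this manual step by computing the curve from predictions and targets directly.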

log_sklearn_roc_auc_curve

tracking.log_sklearn_roc_auc_curve(name, y_preds, y_targets, step=None, timestamp=None, is_multi_class=False)

Calculates and logs ROC/AUC curve using sklearn.

log_sklearn_roc_auc_curve("roc_value", y_preds, y_targets, step=10)

If you are logging a multi-class roc curve, you should set is_multi_class=True to allow persisting curves for all classes.

  • Args:
    • name: str, name of the curve
    • y_preds: List[float] or numpy.array
    • y_targets: List[float] or numpy.array
    • step: int, optional
    • timestamp: datetime, optional
    • is_multi_class: bool, optional

log_pr_curve

tracking.log_pr_curve(name, precision, recall, average_precision=None, step=None, timestamp=None)

Logs a PR curve. This method expects already processed values.

log_pr_curve("pr_value", precision, recall, step=10)
  • Args:
    • name: str, name of the curve
    • precision: List[float] or numpy.array
    • recall: List[float] or numpy.array
    • average_precision: float, optional
    • step: int, optional
    • timestamp: datetime, optional
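
Since the signature of log_pr_curve takes precision, recall, and an optional average_precision, those values have to be computed before the call. The following plain-Python sketch shows one way to do that for binary labels (0 or 1). The helpers pr_points, average_precision_score, and log_pr are hypothetical names, the ordering of points (highest score first) is an assumption rather than a documented requirement, and the tracking argument stands for the initialized polyaxon.tracking module.

```python
def pr_points(y_true, y_score):
    """Return (precision, recall) lists, one point per distinct score."""
    pos = sum(1 for y in y_true if y == 1)
    if pos == 0:
        raise ValueError("at least one positive label is required")
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    precision, recall = [], []
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Group tied scores so each threshold yields a single point.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        precision.append(tp / (tp + fp))
        recall.append(tp / pos)
    return precision, recall


def average_precision_score(precision, recall):
    """Sum of precision weighted by the increase in recall at each point."""
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def log_pr(tracking, name, y_true, y_score, step=None):
    precision, recall = pr_points(y_true, y_score)
    ap = average_precision_score(precision, recall)
    tracking.log_pr_curve(name, precision, recall, average_precision=ap, step=step)
    return ap
```

If scikit-learn is available, log_sklearn_pr_curve computes the curve from predictions and targets and makes this manual step unnecessary.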

log_sklearn_pr_curve

tracking.log_sklearn_pr_curve(name, y_preds, y_targets, step=None, timestamp=None, is_multi_class=False)

Calculates and logs PR curve using sklearn.

log_sklearn_pr_curve("pr_value", y_preds, y_targets, step=10)

If you are logging a multi-class PR curve, you should set is_multi_class=True to allow persisting curves for all classes.

  • Args:
    • name: str, name of the event
    • y_preds: List[float] or numpy.array
    • y_targets: List[float] or numpy.array
    • step: int, optional
    • timestamp: datetime, optional
    • is_multi_class: bool, optional

log_curve

tracking.log_curve(name, x, y, annotation=None, step=None, timestamp=None)

Logs a custom curve.

log_curve("pr_value", x, y, annotation="more=info", step=10)
  • Args:
    • name: str, name of the curve
    • x: List[float] or numpy.array
    • y: List[float] or numpy.array
    • annotation: str, optional
    • step: int, optional
    • timestamp: datetime, optional

log_plotly_chart

tracking.log_plotly_chart(name, figure, step=None, timestamp=None)

Logs a plotly chart/figure.

  • Args:
    • name: str, name of the figure
    • figure: plotly.figure
    • step: int, optional
    • timestamp: datetime, optional

log_bokeh_chart

tracking.log_bokeh_chart(name, figure, step=None, timestamp=None)

Logs a bokeh chart/figure.

  • Args:
    • name: str, name of the figure
    • figure: bokeh.figure
    • step: int, optional
    • timestamp: datetime, optional

log_altair_chart

tracking.log_altair_chart(name, figure, step=None, timestamp=None)

Logs a vega/altair chart/figure.

  • Args:
    • name: str, name of the figure
    • figure: figure
    • step: int, optional
    • timestamp: datetime, optional

log_mpl_plotly_chart

tracking.log_mpl_plotly_chart(name, figure, step=None, timestamp=None)

Logs a matplotlib figure as a plotly figure.

  • Args:
    • name: str, name of the figure
    • figure: figure
    • step: int, optional
    • timestamp: datetime, optional

set_description

tracking.set_description(description, async_req=True)

Sets a new description for the current run.

  • Args:
    • description: str, the description to set.
    • async_req: bool, optional, default: True, execute request asynchronously.

set_name

tracking.set_name(name, async_req=True)

Sets a new name for the current run.

  • Args:
    • name: str, the name to set.
    • async_req: bool, optional, default: True, execute request asynchronously.

log_status

tracking.log_status(status, reason=None, message=None, last_transition_time=None, last_update_time=None)

Logs a new run status.

N.B. If you are executing a managed run, you don't need to call this method manually. This method is only useful for manual runs outside of Polyaxon.

N.B. You will probably use one of the simpler methods:

  • log_succeeded
  • log_stopped
  • log_failed
  • start
  • end

Run API

  • Args:
    • status: str, a valid Statuses value.
    • reason: str, optional, reason or service issuing the status change.
    • message: str, optional, message to log with this status.
    • last_transition_time: datetime, default now.
    • last_update_time: datetime, default now.

log_inputs

tracking.log_inputs(reset=False, async_req=True, **inputs)

Logs or resets new inputs/params for the current run.

Note: If you are starting a run from the CLI/UI polyaxon will track all inputs from the Polyaxonfile, so you generally don't need to set them manually. But you can always add or reset these params/inputs once your code starts running.

  • Args:
    • reset: bool, optional, if True, it will reset the whole inputs state. Note that Polyaxon will automatically populate the inputs based on the Polyaxonfile inputs definition and params passed.
    • async_req: bool, optional, default: True, execute request asynchronously.
    • inputs: **kwargs, e.g. param1=value1, param2=value2, ...
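
For runs that are not started from a Polyaxonfile, one way to record params is to forward parsed command-line arguments as keyword arguments. The sketch below assumes two hypothetical flags, --lr and --epochs; parse_and_log_inputs is a hypothetical helper, and the tracking argument stands for the initialized polyaxon.tracking module.

```python
import argparse


def parse_and_log_inputs(tracking, argv=None):
    """Parse CLI flags and log them as the run's inputs."""
    parser = argparse.ArgumentParser()
    # Hypothetical hyperparameters used only for illustration.
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--epochs", type=int, default=5)
    args = parser.parse_args(argv)
    # vars(args) turns the namespace into {"lr": ..., "epochs": ...}.
    tracking.log_inputs(**vars(args))
    return args
```

Final results can be recorded the same way at the end of the script with log_outputs, for example tracking.log_outputs(best_accuracy=0.91), where best_accuracy is again an illustrative key.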

log_outputs

tracking.log_outputs(reset=False, async_req=True, **outputs)

Logs new outputs/results for the current run.

  • Args:
    • reset: bool, optional, if True, it will reset the whole outputs state. Note that Polyaxon will automatically populate some outputs based on the Polyaxonfile outputs definition and params passed.
    • async_req: bool, optional, default: True, execute request asynchronously.
    • outputs: **kwargs, e.g. output1=value1, metric2=value2, ...

log_tags

tracking.log_tags(tags, reset=False, async_req=True)

Logs new tags for the current run.

  • Args:
    • tags: str or List[str], tag or tags to log.
    • reset: bool, optional, if True, it will reset the whole tags state. Note that Polyaxon will automatically populate the tags based on the Polyaxonfile.
    • async_req: bool, optional, default: True, execute request asynchronously.

log_meta

tracking.log_meta(reset=False, async_req=True, **meta)

Logs meta_info for the current run.

Note: Use carefully! The meta information is used by Polyaxon internally to perform several operations.

Polyaxon Client already uses this method to log information about several events and artifacts, Polyaxon API/Scheduler uses this information to set meta information about the run.

An example use case for this method is to update the concurrency of a pipeline to increase/decrease the initial value:

from polyaxon.client import RunClient
client = RunClient()
client.log_meta(concurrency=5)
  • Args:
    • reset: bool, optional, if True, it will reset the whole meta info state.
    • async_req: bool, optional, default: True, execute request asynchronously.
    • meta: **kwargs, e.g. concurrency=10, has_flag=True, ...

log_succeeded

tracking.log_succeeded()

Sets the current run to succeeded status.

N.B. If you are executing a managed run, you don't need to call this method manually. This method is only useful for manual runs outside of Polyaxon.

log_stopped

tracking.log_stopped()

Sets the current run to stopped status.

N.B. If you are executing a managed run, you don't need to call this method manually. This method is only useful for manual runs outside of Polyaxon.

log_failed

tracking.log_failed(reason=None, message=None)

Sets the current run to failed status.

N.B. If you are executing a managed run, you don't need to call this method manually. This method is only useful for manual runs outside of Polyaxon.
  • Args:
    • reason: str, optional, reason or service issuing the status change.
    • message: str, optional, message to log with this status.

end

tracking.end()

Manually end a run and trigger post done logic (artifacts and lineage collection).
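
For manual runs outside of Polyaxon, the status methods above can be combined in a small wrapper. This is a sketch under assumptions: run_tracked and work are hypothetical names, the tracking argument stands for the polyaxon.tracking module after tracking.init() has been called, and the text does not say whether end() should also be called after log_failed, so the failure branch only logs the failure and re-raises.

```python
def run_tracked(tracking, work):
    """Run `work()` and report the outcome of a manually managed run."""
    try:
        result = work()
    except Exception as exc:
        # reason/message match the documented log_failed(reason, message).
        tracking.log_failed(reason="run_tracked", message=str(exc))
        raise
    # On success, end the run and trigger the post-done logic.
    tracking.end()
    return result
```

For managed runs this wrapper is unnecessary, since Polyaxon sets the final status itself.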


log_model_ref

tracking.log_model_ref(path, name=None, framework=None, summary=None, is_input=False, rel_path=None)

Logs model reference.

Note: The difference between this method and log_model is that this one does not copy or move the asset; it only registers a lineage reference. If you need the model asset to be on the artifacts_path or the outputs_path, you have to copy it manually using a path returned by tracking.get_artifacts_path or tracking.get_outputs_path.

# Get outputs artifact path
asset_path = tracking.get_outputs_path("model/model_data.h5")
with open(asset_path, "w") as f:
    f.write("Artifact content.")
# Log reference to the lineage table
# Name of the artifact will default to model_data
tracking.log_model_ref(path=asset_path)
  • Args:
    • path: str, filepath, the name is extracted from the filepath.
    • name: str, if the name is passed it will be used instead of the filename from the path.
    • framework: str, optional, name of the framework
    • summary: Dict, optional, additional summary information to log about data in the lineage table.
    • is_input: bool, if the file reference is an input or outputs.
    • rel_path: str, optional relative path to the run artifacts path.

log_code_ref

tracking.log_code_ref(code_ref=None, is_input=True)

Logs a code reference as lineage information, with the code_ref dictionary in the summary field.

In offline

  • Args:
    • code_ref: dict, optional, if not provided, Polyaxon will detect the code reference from the git repo in the current path.
    • is_input: bool, if the code reference is an input or outputs.

log_data_ref

tracking.log_data_ref(name, hash=None, path=None, content=None, summary=None, is_input=True)

Logs data reference.

  • Args:
    • name: str, name of the data.
    • hash: str, optional, default = None, the hash version of the data, if not provided it will be calculated based on the data in the content.
    • path: str, optional, path of where the data is coming from.
    • summary: Dict, optional, additional summary information to log about data in the lineage table.
    • is_input: bool, if the data reference is an input or outputs.
    • content: the data content.
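
When the data lives in a file rather than in memory, a hash can be computed up front and passed explicitly. The sketch below uses SHA-256 from the standard library, which is an assumption: the text does not state which algorithm Polyaxon uses when it calculates the hash itself. file_sha256 and log_dataset_ref are hypothetical helpers, and the tracking argument stands for the initialized polyaxon.tracking module.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_dataset_ref(tracking, name, path):
    """Register a lineage reference to a dataset file with an explicit hash."""
    data_hash = file_sha256(path)
    tracking.log_data_ref(name=name, hash=data_hash, path=path, is_input=True)
    return data_hash
```

Logging the same hash across runs makes it easy to see in the lineage table whether two runs consumed identical data.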

log_file_ref

tracking.log_file_ref(path, name=None, hash=None, content=None, summary=None, is_input=True, rel_path=None)

Logs file reference.

  • Args:
    • path: str, filepath, the name is extracted from the filepath.
    • name: str, if the name is passed it will be used instead of the filename from the path.
    • hash: str, optional, default = None, the hash version of the file, if not provided it will be calculated based on the file content.
    • content: the file content.
    • summary: Dict, optional, additional summary information to log about data in the lineage table.
    • is_input: bool, if the file reference is an input or outputs.
    • rel_path: str, optional relative path to the run artifacts path.

log_dir_ref

tracking.log_dir_ref(path, name=None, summary=None, is_input=False, rel_path=None)

Logs dir reference.

  • Args:
    • path: str, dir path, the name is extracted from the path.
    • name: str, if the name is passed it will be used instead of the dirname from the path.
    • summary: Dict, optional, additional summary information to log about data in the lineage table.
    • is_input: bool, if the dir reference is an input or outputs.
    • rel_path: str, optional relative path to the run artifacts path.

log_artifact_lineage

tracking.log_artifact_lineage(body)

Logs an artifact lineage.

Note: This method can be used to log manual lineage objects, it is used internally to log model/file/artifact/code refs

  • Args:
    • body: dict or List[dict] or V1RunArtifact or List[V1RunArtifact], body of the lineage.
    • async_req: bool, optional, default: False, execute request asynchronously.

log_env

tracking.log_env(rel_path=None, content=None)

Logs information about the environment.

Called automatically if track_env is set to True.

It can also be called manually and accepts custom content in the form of a dictionary.

  • Args:
    • rel_path: str, optional, default "env.json".
    • content: Dict, optional, default to current system information.
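
As a sketch of passing custom content, the helper below gathers a few facts from the standard library and forwards them as a dictionary. The keys chosen here are illustrative and do not reflect the structure Polyaxon collects by default; collect_env_content and log_custom_env are hypothetical names, and the tracking argument stands for the initialized polyaxon.tracking module.

```python
import platform
import sys


def collect_env_content(extra=None):
    """Build a small dictionary describing the current interpreter and OS."""
    content = {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
    }
    # Caller-supplied entries (e.g. a CUDA version string) override defaults.
    content.update(extra or {})
    return content


def log_custom_env(tracking, extra=None):
    content = collect_env_content(extra)
    tracking.log_env(content=content)
    return content
```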

set_artifacts_path

tracking.set_artifacts_path(artifacts_path, is_related=False)

Sets the root artifacts_path.

Note: Both in-cluster and offline modes call this method automatically, so be careful when calling it manually. Polyaxon has processes that automatically sync your run's artifacts and outputs.

  • Args:
    • artifacts_path: str, optional
    • is_related: bool, optional, to create multiple runs in-cluster in a notebook or a vscode session.

set_run_event_logger

tracking.set_run_event_logger()

Sets an event logger.

Note: Both in-cluster and offline modes call this method automatically, so be careful when calling it manually. Polyaxon has processes that automatically sync your run's artifacts and outputs.


set_run_resource_logger

tracking.set_run_resource_logger()

Sets a resources logger.

Note: Both in-cluster and offline modes call this method automatically, so be careful when calling it manually. Polyaxon has processes that automatically sync your run's artifacts and outputs.


sync_events_summaries

tracking.sync_events_summaries()

Syncs all tracked events and auto-generates summaries and lineage data.

Note: Both in-cluster and offline modes will manage syncing events summaries automatically, so you should not call this method manually.