Experiment Group Metrics
Experiment groups are how Polyaxon internally runs and manages hyperparameter search and optimization. Polyaxon uses a concept similar to Google Vizier to search hyperparameter spaces and suggest new experiments for the scheduler to train.
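To make the idea of searching a hyperparameter space and suggesting experiments concrete, here is a minimal sketch of random search in plain Python. It is not Polyaxon's implementation: the search-space format, the parameter names, and the `suggest` function are assumptions made up for illustration.

```python
import math
import random

# Hypothetical search space; names, ranges, and the tuple format are
# illustrative assumptions, not Polyaxon's configuration syntax.
SEARCH_SPACE = {
    "learning_rate": ("loguniform", 1e-4, 1e-1),
    "dropout": ("uniform", 0.1, 0.5),
    "batch_size": ("choice", [32, 64, 128]),
}


def suggest(space, n_experiments, seed=0):
    """Draw n_experiments random configurations from the search space."""
    rng = random.Random(seed)
    suggestions = []
    for _ in range(n_experiments):
        params = {}
        for name, spec in space.items():
            kind = spec[0]
            if kind == "uniform":
                params[name] = rng.uniform(spec[1], spec[2])
            elif kind == "loguniform":
                # Sample uniformly in log space, then map back.
                low, high = math.log(spec[1]), math.log(spec[2])
                params[name] = math.exp(rng.uniform(low, high))
            elif kind == "choice":
                params[name] = rng.choice(spec[1])
            else:
                raise ValueError(f"Unknown distribution: {kind}")
        suggestions.append(params)
    return suggestions


if __name__ == "__main__":
    # Each suggestion would be handed to a scheduler as one experiment.
    for params in suggest(SEARCH_SPACE, n_experiments=3):
        print(params)
```

Each returned dictionary corresponds to one experiment in a group. A smarter search algorithm would use the metrics of finished experiments to choose the next suggestions instead of sampling independently.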
Previously, Polyaxon users had to start a TensorBoard to compare experiments within a group. Groups can contain a very large number of experiments, which made TensorBoard very slow.
Today’s release brings a new feature that gives users an overview of how a hyperparameter tuning group performed. Users can now visualize the impact of one parameter’s values on a specific metric or on multiple metrics:
Users can also choose to compare experiments within a group:
We are still evaluating other visualizations, and we would love your feedback on this new feature.
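For readers who want to reproduce the parameter-versus-metric view offline, the following is a rough sketch of the underlying aggregation. It assumes the group's experiments have already been exported as a list of dictionaries. The record layout, the sample values, and the `metric_by_param` helper are all hypothetical and do not come from Polyaxon.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample records standing in for the experiments of one group.
EXPERIMENTS = [
    {"id": 1, "params": {"learning_rate": 0.01, "dropout": 0.2},
     "metrics": {"accuracy": 0.90, "loss": 0.31}},
    {"id": 2, "params": {"learning_rate": 0.01, "dropout": 0.4},
     "metrics": {"accuracy": 0.92, "loss": 0.27}},
    {"id": 3, "params": {"learning_rate": 0.1, "dropout": 0.2},
     "metrics": {"accuracy": 0.80, "loss": 0.55}},
    {"id": 4, "params": {"learning_rate": 0.1, "dropout": 0.4},
     "metrics": {"accuracy": 0.84, "loss": 0.49}},
]


def metric_by_param(experiments, param, metric):
    """Average a metric over all experiments sharing each value of a parameter."""
    buckets = defaultdict(list)
    for exp in experiments:
        value = exp["params"].get(param)
        score = exp["metrics"].get(metric)
        if value is None or score is None:
            continue  # Skip experiments missing the parameter or the metric.
        buckets[value].append(score)
    return {value: mean(scores) for value, scores in sorted(buckets.items())}


if __name__ == "__main__":
    for metric in ("accuracy", "loss"):
        print(metric, metric_by_param(EXPERIMENTS, "learning_rate", metric))
```

Averaging is only one possible summary. Plotting every experiment as a point per parameter value keeps the spread visible, which matters when a group is small.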
Readme and note taking
This release also brings a new collaboration feature: a readme for experiments, jobs, groups, and projects.
You can now add more detailed and meaningful notes to your projects and experiments.
We also made some UI/UX improvements to the dashboard: it should now be easy to update the description and the tags directly from the dashboard.
Health status
As the platform matures, we want to give users a quick way to check on the health of the running services and components. In the future, we will introduce more functionality to let users easily manage the number of processes of these services, enable auto-scaling, and automatically stop services when they are idle. Auto-scaling here refers to the management of Polyaxon’s own components, which is different from the Kubernetes auto-scaling that you can enable to provision more nodes as you train more experiments.
Tracking reference
As we approach the 0.3 release, the recommended way to track experiments will be polyaxon-client instead of polyaxon-helper, which is being deprecated. We updated our documentation to reflect this change, and we will also add new examples showing how you can use the new tracking API to instrument experiments running both in-cluster and in other environments.
Other improvements and bug fixes
We fixed several issues. One notable issue was related to uploads: in some deployments with ingress enabled, users had trouble uploading large files. We updated some annotations to allow the upload of large files and to keep connections from timing out.
Conclusion
Polyaxon will keep improving and providing the simplest machine learning layer on top of Kubernetes. We hope that these updates will improve your workflows and increase your productivity, and again, thank you for your continued feedback and support.