Polyaxon lets you schedule distributed Horovod experiments and supports tracking their metrics, outputs, and models.

MPIJob Operator

Polyaxon provides support for Horovod via the MPIJob Operator.

Define the distributed topology

Please check the Running Horovod guide for more details on how to set up a Horovod experiment with MPI.
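
As a rough illustration of what a distributed topology covers, the sketch below builds a launcher/worker layout as a plain Python dictionary and prints it. It is not taken from the Polyaxon documentation. The field names (`kind: mpijob`, `launcher`, `worker`, `replicas`, `slotsPerWorker`) are assumptions about the MPIJob schema and should be verified against the Running Horovod guide. The helper `build_mpijob_topology`, the image name, and the `train.py` command are hypothetical placeholders.

```python
import json


def build_mpijob_topology(num_workers, slots_per_worker=1,
                          image="horovod/horovod:latest"):
    """Return a dict describing one launcher and `num_workers` workers.

    Field names are assumptions about the Polyaxon MPIJob schema;
    the image and the training command are placeholders.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")

    # Total number of MPI processes: one per slot on every worker.
    total_slots = num_workers * slots_per_worker

    return {
        "version": 1.1,
        "kind": "component",
        "run": {
            "kind": "mpijob",
            "slotsPerWorker": slots_per_worker,
            # The launcher starts mpirun; it does not train itself.
            "launcher": {
                "replicas": 1,
                "container": {
                    "image": image,
                    "command": ["mpirun", "-np", str(total_slots),
                                "python", "train.py"],
                },
            },
            # Workers host the Horovod processes started by the launcher.
            "worker": {
                "replicas": num_workers,
                "container": {"image": image},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_mpijob_topology(num_workers=2), indent=2))
```

The point of the sketch is the shape of the topology: a single launcher replica, a configurable number of worker replicas, and a process count equal to workers multiplied by slots per worker.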