Polyaxon allows users to achieve up to 10x speedups in data preprocessing and to train models at scale using RAPIDS.
Note: Users should also look at CuPy, a NumPy-compatible array library accelerated by CUDA.
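For instance, a minimal sketch of the kind of drop-in usage CuPy enables (array sizes here are arbitrary):

import cupy as cp

# Arrays are allocated on the GPU; the API mirrors numpy
x = cp.random.rand(1_000_000)
y = cp.random.rand(1_000_000)

# Element-wise math and reductions execute on the GPU
z = cp.sqrt(x ** 2 + y ** 2)
print(float(z.mean()))  # bring the scalar result back to the host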
Requirements
To use Polyaxon and RAPIDS to accelerate model training, there are a few requirements:
- Check the OS and CUDA version requirements.
- Use NVIDIA P100 or later generation GPUs.
Docker images
Polyaxon schedules containerized workloads, which makes it simple to create Docker images compatible with RAPIDS.
Specifying requirements via conda
name: Rapids
channels:
  - rapidsai
  - nvidia
  - conda-forge
dependencies:
  - rapids=0.19
  - python=3.7
  - cudatoolkit=10.2
Or via the conda command directly:
conda create -n rapids-0.19 -c rapidsai -c nvidia -c conda-forge \
rapids-blazing=0.19 python=3.7 cudatoolkit=10.2
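As a rough sketch, such a conda spec could be baked into a custom image along these lines (the base image and the environment.yml file name are assumptions for illustration, not an official recipe):

# Sketch only: build a RAPIDS-enabled image from the conda spec above
FROM continuumio/miniconda3

# Copy the conda environment file (assumed to be saved as environment.yml)
COPY environment.yml /tmp/environment.yml
RUN conda env create -f /tmp/environment.yml && conda clean -afy

# Put the "Rapids" environment first on the PATH so it is used by default
ENV PATH=/opt/conda/envs/Rapids/bin:$PATH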
Using the RAPIDS base Docker image
image: rapidsai/rapidsai:0.19-cuda10.2-runtime-ubuntu18.04-py3.7
...
After building and pushing your custom image to a Docker registry, you can run jobs, experiments, or notebooks with the RAPIDS suite of libraries.
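For illustration, a minimal polyaxonfile sketch that runs a job with such a custom image (the image name and command are placeholders):

version: 1.1
kind: component
run:
  kind: job
  container:
    # Placeholder: replace with the custom RAPIDS image you pushed
    image: my-registry/my-rapids-image:latest
    command: ["python", "-u", "train.py"]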
Using RAPIDS
- For data manipulation, users can leverage cuDF, a drop-in replacement for pandas for manipulating DataFrames.
- For tabular workflows, NVTabular, which sits atop RAPIDS, offers high-level abstractions for feature engineering and building recommenders.
- For ML algorithms, RAPIDS offers cuML, a GPU-accelerated library that mirrors scikit-learn's algorithms and API (see the sketch after this list).
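As a rough sketch of how little the code changes (the file name and column names are illustrative):

import cudf
from cuml.linear_model import LinearRegression

# cudf.read_csv mirrors pandas.read_csv but loads data into GPU memory
df = cudf.read_csv("data.csv")
df = df.dropna()

X = df[["feature_1", "feature_2"]]  # illustrative feature columns
y = df["target"]                    # illustrative label column

# cuML estimators follow scikit-learn's fit/predict convention
model = LinearRegression()
model.fit(X, y)
predictions = model.predict(X)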
By using the RAPIDS libraries, Polyaxon users can easily scale their data processing and model development with very few changes to their code.