Distributed Mode (SPMD)

Extension for Scikit-learn* offers interfaces that support the Single Program, Multiple Data (SPMD) model for distributed computing. Several GPU-supported algorithms also provide distributed, multi-GPU computing capabilities through integration with mpi4py. The prerequisites are the same as for GPU computing, plus an MPI backend of your choice (Intel MPI is recommended and is available through the impi_rt Python package) and the mpi4py Python package. If you use Extension for Scikit-learn* installed from sources, ensure that the spmd_backend is built.

Important

SPMD mode requires the mpi4py package used at runtime to be compiled with the same MPI backend as the Extension for Scikit-learn*. The PyPI and Conda distributions of Extension for Scikit-learn* both use Intel’s MPI as the backend, and hence require an mpi4py that is also built with Intel’s MPI. It can be installed from Intel’s conda channel as follows:

conda install -c https://software.repos.intel.com/python/conda/ mpi4py

It also requires the MPI runtime executable (mpiexec / mpirun) to come from the same library that was used to compile the Extension for Scikit-learn*. Intel’s MPI runtime library is offered as a Python package, impi_rt, and is installed together with the mpi4py package by the command above. Otherwise, it can be installed separately from one of the following distribution channels:

  • Intel’s conda channel (recommended):

    conda install -c https://software.repos.intel.com/python/conda/ impi_rt
    
  • Conda-Forge:

    conda install -c conda-forge impi_rt
    
  • PyPI (not recommended, might require setting additional environment variables):

    pip install impi_rt
    

Using other MPI backends (e.g. OpenMPI) requires building Extension for Scikit-learn* from source with that backend.
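
One way to confirm that the installed mpi4py matches the expected backend is to inspect the MPI library version string it reports. The following is a minimal sketch and not part of the extension: the helper names is_intel_mpi and report_mpi_backend are hypothetical, and the assumption that Intel MPI identifies itself with the text "Intel(R) MPI Library" should be checked against your own installation.

```python
def is_intel_mpi(library_version: str) -> bool:
    """Return True if an MPI version string looks like Intel MPI.

    Assumption: Intel MPI identifies itself with the text
    "Intel(R) MPI Library" somewhere in its version string.
    """
    return "intel(r) mpi library" in library_version.lower()


def report_mpi_backend() -> str:
    """Describe the MPI library that the installed mpi4py is linked against."""
    # Imported lazily so this file can be loaded even without mpi4py.
    from mpi4py import MPI

    version = MPI.Get_library_version()
    first_line = version.splitlines()[0] if version else "<empty>"
    kind = "Intel MPI" if is_intel_mpi(version) else "a different MPI backend"
    return f"mpi4py is linked against {kind}: {first_line}"


if __name__ == "__main__":
    try:
        print(report_mpi_backend())
    except ImportError:
        print("mpi4py is not installed in this environment")
```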

Note that Extension for Scikit-learn* supports GPU offloading to speed up MPI operations. Some MPI backends enable this automatically, but with Intel MPI the environment variable I_MPI_OFFLOAD must be set to 1 (passing data that resides on the device without it may lead to a runtime error):

  • On Linux*:

    export I_MPI_OFFLOAD=1
    
  • On Windows*:

    set I_MPI_OFFLOAD=1
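
Because a missing variable only surfaces as a runtime error, a script can check for it before any device data is created. A minimal sketch follows; the helper name offload_enabled is hypothetical, and the check only reads the environment (it does not change any MPI setting).

```python
import os


def offload_enabled(environ=None) -> bool:
    """Return True if I_MPI_OFFLOAD is set to 1 in the given environment."""
    env = os.environ if environ is None else environ
    return env.get("I_MPI_OFFLOAD", "").strip() == "1"


if not offload_enabled():
    # Warn early instead of failing later inside an MPI call.
    print("I_MPI_OFFLOAD is not set to 1; set it before launching with Intel MPI")
```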
    

SPMD-aware versions of estimators can be imported from the sklearnex.spmd module. Data should be distributed across multiple nodes as desired, and should be transferred to a dpctl or dpnp array before being passed to the estimator.
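
How the data is split is left to the user. As one illustration, the sketch below assigns each rank a contiguous block of rows. The helper rank_row_range is hypothetical and not part of sklearnex; in a real program the rank and size values would normally come from MPI.COMM_WORLD.Get_rank() and MPI.COMM_WORLD.Get_size() in mpi4py.

```python
def rank_row_range(n_rows: int, rank: int, size: int) -> tuple[int, int]:
    """Return the half-open [start, stop) row range owned by `rank`.

    Rows are split as evenly as possible; the first `n_rows % size`
    ranks receive one extra row each.
    """
    if size <= 0 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    base, extra = divmod(n_rows, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Example: 10 rows over 4 ranks.
for rank in range(4):
    print(rank, rank_row_range(10, rank, 4))
```

Each rank would then load or slice only its own rows and convert that block to a dpctl or dpnp array.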

Note that SPMD estimators accept an additional argument, queue, in their .fit / .predict methods, which takes a dpctl.SyclQueue object. For example, while the signature for sklearn.linear_model.LinearRegression.predict is:

def predict(self, X): ...

The signature for the corresponding method, sklearnex.spmd.linear_model.LinearRegression.predict, is:

def predict(self, X, queue=None): ...
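
To check at runtime whether a given method accepts the extra argument, its signature can be inspected. The sketch below uses two stand-in functions that mirror the signatures shown above, so it runs without sklearnex installed; accepts_queue is a hypothetical helper, not a library function.

```python
import inspect


def accepts_queue(method) -> bool:
    """Return True if `method` declares a parameter named `queue`."""
    return "queue" in inspect.signature(method).parameters


# Stand-ins mirroring the two signatures shown above (not the real classes).
def stock_predict(self, X): ...
def spmd_predict(self, X, queue=None): ...


print(accepts_queue(stock_predict))  # False
print(accepts_queue(spmd_predict))   # True
```

With the real packages installed, the same helper can be pointed at sklearnex.spmd.linear_model.LinearRegression.predict, which is expected to report True given the signature above.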

Examples of SPMD usage can be found in the GitHub repository for the Extension for Scikit-learn* under examples/sklearnex.

To run in SPMD mode, first create a Python file that uses SPMD estimators from sklearnex.spmd, such as linear_regression_spmd.py.

Then, execute the file through MPI under multiple ranks - for example:

  • On Linux*:

    mpirun -n 4 python linear_regression_spmd.py
    
  • On Windows*:

    mpiexec -n 4 python linear_regression_spmd.py
    

Note that additional mpirun arguments can be added as desired. SPMD-supported estimators are listed in the SPMD Support section.
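
For orientation, a file of the kind described above might look roughly like the following. This is a sketch under stated assumptions rather than the example shipped in the repository: the data is synthetic, float32 is chosen on the assumption that not every GPU supports double precision, and a GPU device is assumed to be available on every rank.

```python
def main() -> None:
    # Third-party imports are kept inside main() so that a missing
    # prerequisite is reported as a single ImportError at startup.
    import numpy as np
    import dpctl
    import dpctl.tensor as dpt
    from mpi4py import MPI
    from sklearnex.spmd.linear_model import LinearRegression

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Assumption: each rank can create a queue on a GPU device.
    queue = dpctl.SyclQueue("gpu")

    # Synthetic data for illustration: each rank draws its own rows
    # from the same linear model y = X @ coef + noise.
    coef = np.arange(1.0, 6.0, dtype=np.float32)
    rng = np.random.default_rng(seed=rank)
    X = rng.uniform(-1.0, 1.0, size=(1000, coef.size)).astype(np.float32)
    y = (X @ coef + rng.normal(scale=0.01, size=1000)).astype(np.float32)

    # Move the local block to the device before passing it to the estimator.
    X_dev = dpt.asarray(X, usm_type="device", sycl_queue=queue)
    y_dev = dpt.asarray(y, usm_type="device", sycl_queue=queue)

    model = LinearRegression().fit(X_dev, y_dev)
    predictions = model.predict(X_dev)

    print(f"rank {rank}: coefficients {model.coef_}")
    print(f"rank {rank}: predictions shape {predictions.shape}")


if __name__ == "__main__":
    try:
        main()
    except ImportError as exc:
        print(f"SPMD prerequisites are missing: {exc}")
```

Saved under a name of your choice, the file would be launched with the same mpirun or mpiexec command shown above.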

Additionally, daal4py (previously a separate package, now an importable module within scikit-learn-intelex) offers some distributed functionality; see its documentation for further details.