Supported input types

Just like scikit-learn estimators, estimators from the Extension for Scikit-learn* are able to work with different classes of input data, including:

In addition, Extension for Scikit-learn* also supports dpnp.ndarray arrays (with and without array API mode) in estimators with GPU support (see also Array API support).

Extension for Scikit-learn* currently does not offer accelerated routines for input types not listed here. When receiving an unsupported class, estimators will do one of the following: convert it to a supported class under some circumstances (e.g. PyArrow tables might get converted to NumPy arrays when passed to data validators from stock scikit-learn), throw an error (e.g. when passing a data format that is not recognized by scikit-learn), or fall back to stock scikit-learn to handle it (when array API is enabled but the input is unsupported).
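Since the outcome for an unsupported class depends on the circumstances, one option is to convert the data to a NumPy array explicitly before it reaches an estimator. The snippet below is a minimal sketch of that idea using only NumPy; `to_numpy_2d` is a hypothetical helper written for illustration and is not part of Extension for Scikit-learn* or scikit-learn.

```python
import numpy as np


def to_numpy_2d(data):
    """Hypothetical helper: coerce array-like input to a 2D numeric NumPy array."""
    arr = np.asarray(data)
    # Reject object or string data instead of letting it reach an estimator.
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError(f"expected numeric data, got dtype {arr.dtype}")
    if arr.ndim != 2:
        raise ValueError(f"expected a 2D array, got {arr.ndim} dimension(s)")
    return arr


# Nested lists become a NumPy array before any estimator sees them.
X = to_numpy_2d([[1.0, 2.0], [3.0, 4.0]])
```

Doing the conversion up front makes it visible in user code, rather than leaving it to whichever validation path the estimator happens to take.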

Warning

In some cases, data passed to estimators might be copied/duplicated during calls to methods such as fit/predict. The affected cases are listed below:

  • Non-contiguous NumPy arrays, i.e. arrays whose strides are wider than one element across both rows and columns.

  • For SciPy CSR matrix / array, index arrays are always copied. Note that sparse matrices in formats other than CSR will be converted to CSR, which implies more than just data copying.

  • Heterogeneous NumPy array.

  • If a SYCL queue for a device without float64 support is used for the Target offload option but the data is float64, the data will be converted to float32.

  • If a dpnp.ndarray array on GPU is used as input without Array API support being enabled, then data will be transferred to CPU for validations and then back to GPU for computations (see GPU support for details).

  • If a dpnp.ndarray array on GPU is passed to an estimator with GPU support but the requested operation is not supported on GPU, and if array API is not enabled or scikit-learn does not support array API for the requested operation, then data might be transferred to CPU and the operation done there (see Configuration Contexts and Global Options).
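Two of the cases above (non-contiguous NumPy arrays and float64 data on a device without float64 support) can be anticipated in user code. The following is a minimal sketch using only NumPy, not the checks the library itself performs: `strides_exceed_itemsize`, `prepare`, and the `device_supports_fp64` flag are hypothetical names introduced here, and the stride test is only an approximation of the condition described in the first bullet.

```python
import numpy as np


def strides_exceed_itemsize(a):
    """Return True if every axis of a 2D array has a stride wider than one element."""
    return a.ndim == 2 and all(abs(s) > a.itemsize for s in a.strides)


def prepare(a, device_supports_fp64=True):
    """Hypothetical helper: make the copies explicit before calling fit/predict."""
    if strides_exceed_itemsize(a):
        # One explicit copy into C-contiguous memory.
        a = np.ascontiguousarray(a)
    if not device_supports_fp64 and a.dtype == np.float64:
        # Downcast up front; device_supports_fp64 is an assumed, user-supplied flag.
        a = a.astype(np.float32)
    return a


base = np.arange(36, dtype=np.float64).reshape(6, 6)
view = base[::2, ::2]  # strided across both rows and columns
packed = prepare(view, device_supports_fp64=False)
```

Preparing the data once and reusing the result for repeated calls keeps the copy in one known place, under the assumption that the prepared array is then passed through unchanged.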