About daal4py
Introduction
daal4py is a low-level module within the Extension for Scikit-learn* package that provides Python bindings
for the oneAPI Data Analytics Library. It has been deprecated in favor of the newer sklearnex module in the
same package, which offers a more idiomatic, higher-level interface for calling accelerated
routines from the oneAPI Data Analytics Library in Python.
Internally, daal4py is a Python wrapper over the now-deprecated “DAAL” interface
of the oneAPI Data Analytics Library, while sklearnex is a module built on top of the “oneAPI” interface, offering
DPC++-based features such as GPU support.
The functionality offered by the two modules daal4py and sklearnex overlaps to a large degree.
The sklearnex module should be preferred whenever possible, either by using it directly or through
the patching mechanism, but daal4py exposes some additional functionality from the oneAPI Data
Analytics Library that sklearnex doesn’t:
Fast serving of gradient boosted decision trees from other libraries such as XGBoost (model builders).
Previously, daal4py was distributed as a separate package, but it is now an importable module
within the scikit-learn-intelex package. After installing scikit-learn-intelex,
it can be imported as follows:
import daal4py
For documentation about specific functions, see the daal4py API reference.
Using daal4py
Unlike sklearnex, daal4py is a lower-level interface that does not follow scikit-learn
idioms. Instead, the process for calling procedures from the daal4py interface is as follows:
Instantiate an ‘algorithm’ class by calling its constructor, without any data - for example: qr_algo = daal4py.qr().
Call the ‘compute’ method of that instantiated algorithm, passing it the data on which it will operate, in order to obtain a ‘result’ object - for example: qr_result = qr_algo.compute(X).
Access the relevant results in the ‘result’ object - for example: R = qr_result.matrixR.
Full example calling the QR algorithm:
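import daal4py
import numpy as np

# Random 100x5 input matrix
rng = np.random.default_rng(seed=123)
X = rng.standard_normal(size=(100, 5))

# Instantiate the algorithm, run it on the data, and obtain the result object
qr_algo = daal4py.qr()
qr_result = qr_algo.compute(X)

# Compare R against NumPy's result, ignoring signs (see the note below)
np.testing.assert_almost_equal(
    np.abs(qr_result.matrixR),
    np.abs(np.linalg.qr(X).R),
)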
Note
QR factorization, unlike some other linear algebra procedures, does not have a strictly unique solution - if the signs (+/-) are flipped for a particular column of the Q matrix and for the corresponding row of the R matrix, the result is still a valid and equivalent QR factorization of the same original matrix ‘X’.
Procedures like Cholesky decomposition are typically constrained to have only positive entries in the main diagonal in order to make the result unique, but software implementing QR often does not enforce such a constraint, hence the example above takes the absolute values when comparing results from different libraries.
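The sign ambiguity described in this note can be checked with NumPy alone. The following is a minimal sketch, not part of daal4py: the helper normalize_qr_signs is a hypothetical name introduced here for illustration, and it assumes a full-rank input so that no diagonal entry of R is zero.

```python
import numpy as np

rng = np.random.default_rng(seed=123)
X = rng.standard_normal(size=(100, 5))

# Reduced QR factorization: Q is 100x5, R is 5x5
Q, R = np.linalg.qr(X)

# Flip the sign of column 0 of Q and of the matching row 0 of R
signs = np.array([-1.0, 1.0, 1.0, 1.0, 1.0])
Q_flipped = Q * signs            # scales column j of Q by signs[j]
R_flipped = signs[:, None] * R   # scales row j of R by signs[j]

# Both pairs are valid factorizations of the same matrix
np.testing.assert_allclose(Q @ R, X, atol=1e-12)
np.testing.assert_allclose(Q_flipped @ R_flipped, X, atol=1e-12)


def normalize_qr_signs(Q, R):
    """Hypothetical helper: make the diagonal of R non-negative."""
    d = np.sign(np.diag(R))
    d[d == 0] = 1.0  # leave columns with a zero diagonal entry untouched
    return Q * d, d[:, None] * R
```

After normalizing both factorizations this way, the R matrices can be compared directly rather than through their absolute values.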
Streaming mode
Many algorithms in daal4py accept an argument streaming=True, which allows executing the
computations in a ‘streaming’ or ‘online’ fashion: the data is supplied in subsets (batches),
one at a time, instead of being passed all upfront, while still arriving at the
same final result as if all the data had been passed at once.
Note
The sklearnex module also offers incremental versions of some algorithms - see the docs
on Non-Scikit-Learn Algorithms for more details.
This can be useful for executing algorithms on large datasets that don’t fit in memory but which can still be loaded in smaller chunks, or for machine learning models that are constantly being updated as new data is collected, for example.
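For intuition on how a batch-wise computation can reach the same result as a single pass over all the data, the sketch below updates the R factor of a QR factorization one batch at a time using only NumPy. It illustrates the general idea under the assumption of a full-rank input; it is not a description of how daal4py implements streaming internally.

```python
import numpy as np

rng = np.random.default_rng(seed=123)
X_full = rng.standard_normal(size=(100, 5))

# Start with an empty R factor and fold in one batch at a time.
# Only the current 5x5 R and the current batch are held at each step.
R_running = np.empty((0, 5))
for X_batch in np.array_split(X_full, 5):
    stacked = np.vstack([R_running, X_batch])
    R_running = np.linalg.qr(stacked, mode="r")

# R computed from all the data at once
R_full = np.linalg.qr(X_full, mode="r")

# The two agree up to the sign of each row
np.testing.assert_allclose(np.abs(R_running), np.abs(R_full), atol=1e-10)
```

This works because stacking the previous R on top of a new batch preserves the product of the transposed data matrix with itself, which determines R up to row signs.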
In order to use streaming mode, the algorithm constructor needs to be passed the argument streaming=True,
the method .compute() needs to be called multiple times with different data, and the ‘result’
object should be obtained by calling the method .finalize() after all the data has been passed.
Example:
import daal4py
import numpy as np

rng = np.random.default_rng(seed=123)
X_full = rng.standard_normal(size=(100, 5))

# Row indices of five batches with 20 rows each
batches = np.split(np.arange(100), 5)

qr_algo = daal4py.qr(streaming=True)
for batch in batches:
    X_batch = X_full[batch]
    qr_algo.compute(X_batch)  # feed one batch at a time
qr_result = qr_algo.finalize()  # obtain the result after the last batch

# Compare against NumPy's result on the full data, ignoring signs
np.testing.assert_almost_equal(
    np.abs(qr_result.matrixR),
    np.abs(np.linalg.qr(X_full).R),
)
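The compute-then-finalize pattern can be wrapped in a small helper so that batches may come from any iterable, such as a generator that reads chunks from disk. This is a sketch only: run_streaming is a hypothetical name, not a daal4py function, and it assumes nothing about the algorithm object beyond the .compute() and .finalize() methods described above.

```python
def run_streaming(algo, batches):
    """Feed each batch to algo.compute(), then return algo.finalize().

    `algo` is assumed to be an algorithm object created with streaming=True;
    `batches` is any iterable that yields one chunk of data at a time.
    """
    n_batches = 0
    for batch in batches:
        algo.compute(batch)
        n_batches += 1
    if n_batches == 0:
        raise ValueError("no batches were supplied")
    return algo.finalize()
```

With daal4py installed, a call could look like run_streaming(daal4py.qr(streaming=True), np.array_split(X_full, 5)), which returns the same kind of ‘result’ object as calling .finalize() by hand.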
List of algorithms in daal4py supporting streaming mode:
Distributed mode
Many algorithms in daal4py accept an argument distributed=True, which allows
running computations across distributed compute nodes using the MPI framework.
See the section Distributed mode (daal4py, CPU) for more details.
Documentation
See daal4py API Reference for the full documentation of functions and classes.