daal4py API Reference

This is the full documentation page for daal4py functions and classes. Note that for the most part, these are simple wrappers over equivalent functions and methods from the oneAPI Data Analytics Library. See also the documentation of DAAL interfaces for more details.

See About daal4py for an example of how to use daal4py algorithms.

Thread control

Documentation for functions that control the global thread settings in daal4py:

daal4py.daalinit(nthreads: int = -1) → None

Set number of threads for daal4py

This modifies the number of threads configured for daal4py. It is a global setting, meaning that it applies to all subsequent calls to daal4py functions and methods in the Python process.

By default, if not otherwise configured, daal4py will use the full number of threads available on the system.

Parameters:

nthreads (int) – [default: -1] Number of threads to use for further computations in daal4py. If this number is less than or equal to zero, the setting is left unchanged.

Return type:

None

daal4py.num_threads() → int

Get the number of threads configured for daal4py.

Note

The number of threads for daal4py is a global setting, which can be changed through daalinit.

Return type:

int

daal4py.enable_thread_pinning(enabled: bool = True) → None

Enable or disable thread pinning

This function enables or disables binding of the threads used to parallelize the library's algorithms to physical processing units, which can improve performance. Improper use of the method can degrade application performance depending on the system (machine) topology, the application, and the operating system. By default, pinning is disabled.

Note

This is a global setting for daal4py.

Parameters:

enabled (bool) – [default: True] Whether to enable thread pinning or not.

Return type:

None
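
As a quick illustration, here is a minimal sketch combining the three thread-control functions above (the thread count of 4 is an arbitrary choice):

    import daal4py

    daal4py.daalinit(nthreads=4)           # limit daal4py to 4 threads globally
    print(daal4py.num_threads())           # prints 4
    daal4py.enable_thread_pinning()        # pin threads to physical processing units
    daal4py.enable_thread_pinning(False)   # ... and unpin them again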

MPI helpers

Documentation for helper functions that can be used in distributed mode, particularly when using MPI without mpi4py. See Distributed mode (daal4py, CPU) for examples.

daal4py.daalfini() → None

Finalize MPI environment

When using distributed mode without mpi4py, this function must be called after the distributed computations and before accessing the result object of the algorithm that was executed in distributed mode. It has no effect when the Python process is not run through MPI (used for distributed mode).

This is a wrapper over MPI_Finalize. Note that mpi4py also calls this function automatically if it has been imported, but it only does so upon process exit, so daalfini still needs to be called explicitly before accessing the result objects in the process/rank that will use them.

Return type:

None

daal4py.num_procs() → int

Get number of MPI processes (in distributed mode)

If the Python process is not run through MPI, this function will always return 1.

This is a wrapper over MPI_Comm_size. Equivalent to mpi4py.MPI.Comm.Get_size, but does not require mpi4py to be installed.

Return type:

int

daal4py.my_procid() → int

Get MPI process rank

If the Python process is not being run through MPI (used for distributed mode), this will always return zero.

This is a wrapper over MPI_Comm_rank. Equivalent to mpi4py.MPI.Comm.Get_rank, but does not require mpi4py to be installed.

Return type:

int
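
To illustrate how these helpers fit together, here is a minimal sketch of a script meant to be run through MPI (e.g. mpirun -n 4 python script.py); the distributed algorithm call itself is elided:

    import daal4py

    print("rank", daal4py.my_procid(), "of", daal4py.num_procs())

    # ... calls to algorithms constructed with distributed=True go here ...

    daal4py.daalfini()  # finalize MPI before accessing the distributed results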

Model builders (GBT and LogReg serving)

Documentation for model builders, which allow computing fast predictions from GBT (gradient-boosted decision tree) models produced by other libraries. See article Serving GBT models from other libraries for examples.

daal4py.mb.convert_model(model) → GBTDAALModel | LogisticDAALModel

Convert GBT or LogReg models to daal4py

This function can be used to convert machine learning models / estimators created through other libraries to daal4py classes which offer accelerated prediction methods.

It supports gradient-boosted decision tree ensembles (GBT) from the libraries xgboost, lightgbm, and catboost; and logistic regression (binary and multinomial) models from scikit-learn.

See the documentation of the classes daal4py.mb.GBTDAALModel and daal4py.mb.LogisticDAALModel for more details.

As an alternative to this function, models of a specific type (GBT or LogReg) can also be instantiated by calling those classes directly - for example, logistic regression models can be instantiated directly from fitted coefficients and intercepts, thereby making it possible to work with models from libraries beyond scikit-learn.

Parameters:

model (fitted model object) – A fitted model object (either GBT or LogReg) from the supported libraries.

Returns:

obj – A daal4py model object of the corresponding class for the model type, which offers faster prediction methods.

Return type:

GBTDAALModel or LogisticDAALModel
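
A minimal sketch of converting an xgboost classifier, assuming xgboost is installed; the data here is synthetic and the hyperparameters are arbitrary:

    import numpy as np
    import xgboost as xgb
    from daal4py.mb import convert_model

    rng = np.random.default_rng(123)
    X = rng.standard_normal((200, 10))
    y = rng.integers(0, 2, size=200)

    booster = xgb.XGBClassifier(n_estimators=10).fit(X, y)
    d4p_model = convert_model(booster)       # returns a GBTDAALModel
    accelerated_preds = d4p_model.predict(X)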

class daal4py.mb.GBTDAALModel(model)

Gradient Boosted Decision Tree Model

Model class offering accelerated predictions for gradient-boosted decision tree models from other libraries.

Objects of this class are meant to be initialized from GBT model objects created through other libraries, producing an object of a different class which can calculate predictions faster than the library that created the original model.

Can be created from model objects that meet all of the following criteria:

  • Were produced from one of the following libraries: xgboost, lightgbm, or catboost. It can work with either the base booster classes of those libraries or with their scikit-learn-compatible classes.

  • Do not use categorical features.

  • Are for regression or classification (e.g. no ranking objectives). For the XGBoost objective binary:logitraw, a classification model will be created, while for the objective reg:logistic, a regression model will be created.

  • Are not multi-output models. Note that multi-class classification is supported.

Parameters:

model (booster object from another library) – The fitted GBT model from which this object will be created. See the rest of the documentation for supported input types.

property is_classifier_: bool

Whether this is a classification model

property is_regressor_: bool

Whether this is a regression model

predict(X, pred_contribs: bool = False, pred_interactions: bool = False) → ndarray

Compute model predictions on new data

Computes the predicted values of the response variable for new data given the features / covariates for each row.

In the case of classification models, this will output the most probable class (see predict_proba() for probability predictions), while in the case of regression models, it will output values on the link scale (what XGBoost calls ‘margin’ and LightGBM calls ‘raw’).

Parameters:
  • X – The features / covariates. Should be an array of shape [num_samples, num_features].

  • pred_contribs (bool) – Whether to predict feature contributions. Result should have shape [num_samples, num_features+1], with the last column corresponding to the intercept. See xgboost.Booster.predict for more details about this type of computation.

  • pred_interactions (bool) – Whether to predict feature interactions. Result should have shape [num_samples, num_features+1, num_features+1], with the last position across the last two dimensions corresponding to the intercept. See xgboost.Booster.predict for more details about this type of computation.

Return type:

np.ndarray

predict_proba(X) → ndarray

Predict class probabilities

Computes the predicted probabilities of belonging to each class for each row in the input data given the features / covariates. Output shape is [num_samples, num_classes].

Parameters:

X – The features / covariates. Should be an array of shape [num_samples, num_features].

Return type:

np.ndarray
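
Continuing the convert_model sketch above (reusing d4p_model and X from it), the prediction methods can be exercised as follows; the shape comments restate the descriptions above:

    print(d4p_model.is_classifier_)                       # True for this model
    proba = d4p_model.predict_proba(X)                    # [num_samples, num_classes]
    contribs = d4p_model.predict(X, pred_contribs=True)   # [num_samples, num_features + 1]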

class daal4py.mb.LogisticDAALModel(coefs, intercepts, dtype=<class 'numpy.float64'>)

Logistic Regression Predictor

Creates a logistic regression or multinomial logistic regression model object which can calculate fast predictions of different types (classes, probabilities, logarithms of probabilities) from fitted coefficients and intercepts obtained elsewhere (such as from sklearn.linear_model.LogisticRegression), making the predictions either in double (np.float64) or single (np.float32) precision.

Parameters:
  • coefs (array(n_classes, n_features) or array(n_features,)) – The fitted model coefficients. Note that only dense arrays are supported. In the case of binary classification, can be passed as a 1D array or as a 2D array having a single row.

  • intercepts (array(n_classes) or float) – The fitted intercepts. In the case of binary classification, must be passed as either a scalar, or as a 1D array with a single entry.

  • dtype (np.float32 or np.float64) – The dtype to use for the object.

n_classes_

Number of classes in the model.

Type:

int

n_features_in_

Number of features in the model.

Type:

int

dtype_

The dtype of the model

Type:

np.dtype

coef_

The model coefficients

Type:

array(n_classes, n_features)

intercept_

The model intercepts

Type:

array(n_classes)

predict(X) → ndarray

Predict most probable class

Parameters:

X (array-like(n_samples, n_features)) – The features / covariates for each row. Can be passed as either a NumPy array or as a sparse CSR array/matrix from SciPy. For faster results, use the same dtype as what this object was built for.

Returns:

classes – The most probable class, as integer indexes

Return type:

array(n_samples,)

predict_log_proba(X) → ndarray

Predict log-probabilities of belonging to each class

Parameters:

X (array-like(n_samples, n_features)) – The features / covariates for each row. Can be passed as either a NumPy array or as a sparse CSR array/matrix from SciPy. For faster results, use the same dtype as what this object was built for.

Returns:

log_proba – The logarithms of the predicted probabilities for each class.

Return type:

array(n_samples, n_classes)

predict_multiple(X, classes: bool = True, proba: bool = True, log_proba: bool = True) → classifier_prediction_result

Make multiple prediction types at once

A method that can output the results from predict, predict_proba, and predict_log_proba all together in the same call more efficiently than computing them independently.

Parameters:
  • X (array-like(n_samples, n_features)) – The features / covariates for each row. Can be passed as either a NumPy array or as a sparse CSR array/matrix from SciPy. For faster results, use the same dtype as what this object was built for.

  • classes (bool) – Whether to output class predictions (what is obtained from predict()).

  • proba (bool) – Whether to output per-class probability predictions (what is obtained from predict_proba()).

  • log_proba (bool) – Whether to output per-class logarithms of probabilities (what is obtained from predict_log_proba()).

Returns:

predictions – An object of class daal4py.classifier_prediction_result with the requested prediction types for the same X data.

Return type:

classifier_prediction_result

predict_proba(X) → ndarray

Predict probabilities of belonging to each class

Parameters:

X (array-like(n_samples, n_features)) – The features / covariates for each row. Can be passed as either a NumPy array or as a sparse CSR array/matrix from SciPy. For faster results, use the same dtype as what this object was built for.

Returns:

proba – The predicted probabilities for each class.

Return type:

array(n_samples, n_classes)
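
A minimal sketch of building a LogisticDAALModel from a scikit-learn estimator fitted on synthetic data (assuming scikit-learn is installed):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from daal4py.mb import LogisticDAALModel

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))
    y = rng.integers(0, 2, size=100)

    skl_model = LogisticRegression().fit(X, y)
    d4p_model = LogisticDAALModel(skl_model.coef_, skl_model.intercept_)

    # several prediction types in one call
    res = d4p_model.predict_multiple(X, classes=True, proba=True, log_proba=False)
    print(res.prediction.shape, res.probabilities.shape)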

Classification

Note

All classification algorithms produce a result object of the same class, containing the predicted probabilities, the logarithms of the predicted probabilities, and the most probable class.

Results class

class daal4py.classifier_prediction_result

Properties:

logProbabilities

Numpy array

Type:

type

prediction

Numpy array

Type:

type

probabilities

Numpy array

Type:

type

Decision Forest Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification Decision Forest.

Examples:
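
A minimal end-to-end sketch on synthetic data (daal4py expects 2D NumPy arrays, with labels as a single column; the hyperparameters are arbitrary):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(42)
    X = rng.standard_normal((100, 4))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)  # labels as a column

    train_result = d4p.decision_forest_classification_training(nClasses=2, nTrees=10).compute(X, y)
    predict_result = d4p.decision_forest_classification_prediction(nClasses=2).compute(X, train_result.model)
    print(predict_result.prediction[:5])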

class daal4py.decision_forest_classification_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for Decision forest, double or float

  • method (str) – [optional, default: “defaultDense”] Decision forest computation method

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • nTrees (size_t) – [optional, default: -1] Number of trees in the forest. Default is 10

  • observationsPerTreeFraction (double) – [optional, default: get_nan64()] Fraction of observations used for training of one tree, 0 to 1. Default is 1 (sampling with replacement)

  • featuresPerNode (size_t) – [optional, default: -1] Number of features tried as possible splits per node. If 0 then sqrt(p) for classification, p/3 for regression, where p is the total number of features.

  • maxTreeDepth (size_t) – [optional, default: -1] Maximal tree depth. Default is 0 (unlimited)

  • minObservationsInLeafNode (size_t) – [optional, default: -1] Minimal number of observations in a leaf node. Default is 1 for classification, 5 for regression.

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for the random number generator used by the algorithms

  • impurityThreshold (double) – [optional, default: get_nan64()] Threshold value used as stopping criteria: if the impurity value in the node is smaller than the threshold then the node is not split anymore.

  • varImportance (str) – [optional, default: “”] Variable importance computation mode

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • memorySavingMode (bool) – [optional, default: False] If true then use memory saving (but slower) mode

  • bootstrap (bool) – [optional, default: False] If true then training set for a tree is a bootstrap of the whole training set

  • minObservationsInSplitNode (size_t) – [optional, default: -1] Minimal number of observations in a split node. Default 2

  • minWeightFractionInLeafNode (double) – [optional, default: get_nan64()] The minimum weighted fraction of the sum total of weights (of all the input observations) required to be at a leaf node, 0.0 to 0.5. Default is 0.0

  • minImpurityDecreaseInSplitNode (double) – [optional, default: get_nan64()] A node will be split if this split induces a decrease of the impurity greater than or equal to the value, non-negative. Default is 0.0

  • maxLeafNodes (size_t) – [optional, default: -1] Maximum number of leaf nodes. Default is 0 (unlimited)

  • maxBins (size_t) – [optional, default: -1] Used with ‘hist’ split finding method only. Maximal number of discrete bins to bucket continuous features. Default is 256. Increasing the number results in higher computation costs

  • minBinSize (size_t) – [optional, default: -1] Used with ‘hist’ split finding method only. Minimal number of observations in a bin. Default is 5

  • splitter (str) – [optional, default: “”] Sets node splitting method. Default is best

  • binningStrategy (str) – [optional, default: “”] Used with ‘hist’ split finding method only. Selects the strategy to group data points into bins. Allowed values are ‘quantiles’ (default), ‘averages’

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

decision_forest_classification_training_result

class daal4py.decision_forest_classification_training_result

Properties:

model

decision_forest_classification_model

Type:

type

outOfBagError

Numpy array

Type:

type

outOfBagErrorAccuracy

Numpy array

Type:

type

outOfBagErrorDecisionFunction

Numpy array

Type:

type

outOfBagErrorPerObservation

Numpy array

Type:

type

variableImportance

Numpy array

Type:

type

class daal4py.decision_forest_classification_prediction
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the decision_forest algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] decision_forest computation method

  • votingMethod (str) – [optional, default: “”]

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (decision_forest_classification_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.decision_forest_classification_model

Properties:

NFeatures

size_t

Type:

type

NumberOfClasses

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfTrees

size_t

Type:

type

Decision Tree Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification Decision Tree.

Examples:
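
A minimal sketch on synthetic data; note that compute takes a separate pruning set (here also synthetic):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(7)
    X, Xp = rng.standard_normal((100, 4)), rng.standard_normal((30, 4))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)
    yp = rng.integers(0, 2, size=(30, 1)).astype(np.float64)

    train_result = d4p.decision_tree_classification_training(nClasses=2).compute(X, y, Xp, yp)
    pred = d4p.decision_tree_classification_prediction(nClasses=2).compute(X, train_result.model).prediction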

class daal4py.decision_tree_classification_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for Decision tree model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] Decision tree training method

  • pruning (str) – [optional, default: “”] Pruning method for Decision tree

  • maxTreeDepth (size_t) – [optional, default: -1] Maximum tree depth. 0 means unlimited depth.

  • minObservationsInLeafNodes (size_t) – [optional, default: -1] Minimum number of observations in the leaf node. Can be any positive number.

  • nBins (size_t) – [optional, default: -1] The number of bins used to compute probabilities of the observations belonging to the class. The only value supported in the current version of the library is 1.

  • splitCriterion (str) – [optional, default: “”] Split criterion for Decision tree classification

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, dataForPruning, labelsForPruning, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • dataForPruning (data_or_file) – Pruning data set

  • labelsForPruning (data_or_file) – Labels of the pruning data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

decision_tree_classification_training_result

class daal4py.decision_tree_classification_training_result

Properties:

model

decision_tree_classification_model

Type:

type

class daal4py.decision_tree_classification_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for Decision tree model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

  • pruning (str) – [optional, default: “”] Pruning method for Decision tree

  • maxTreeDepth (size_t) – [optional, default: -1] Maximum tree depth. 0 means unlimited depth.

  • minObservationsInLeafNodes (size_t) – [optional, default: -1] Minimum number of observations in the leaf node. Can be any positive number.

  • nBins (size_t) – [optional, default: -1] The number of bins used to compute probabilities of the observations belonging to the class. The only value supported in the current version of the library is 1.

  • splitCriterion (str) – [optional, default: “”] Split criterion for Decision tree classification

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (decision_tree_classification_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.decision_tree_classification_model

Properties:

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

Gradient Boosted Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification Gradient Boosted Tree.

Examples:
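
A minimal sketch on synthetic data (arbitrary hyperparameters):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 6))
    y = rng.integers(0, 2, size=(200, 1)).astype(np.float64)

    train_result = d4p.gbt_classification_training(nClasses=2, maxIterations=40).compute(X, y)
    pred = d4p.gbt_classification_prediction(nClasses=2).compute(X, train_result.model).prediction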

class daal4py.gbt_classification_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for Gradient Boosted Trees, double or float

  • method (str) – [optional, default: “defaultDense”] Gradient Boosted Trees computation method

  • loss (str) – [optional, default: “”] Loss function type

  • varImportance (str) – [optional, default: “”] 64-bit integer flag VariableImportanceModes that indicates the variable importance computation modes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • splitMethod (str) – [optional, default: “”] Split finding method. Default is exact

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the gradient boosted trees training algorithm. Default is 50

  • maxTreeDepth (size_t) – [optional, default: -1] Maximal tree depth, 0 for unlimited. Default is 6

  • shrinkage (double) – [optional, default: get_nan64()] Learning rate of the boosting procedure. Scales the contribution of each tree by a factor (0, 1]. Default is 0.3

  • minSplitLoss (double) – [optional, default: get_nan64()] Loss regularization parameter. Min loss reduction required to make a further partition on a leaf node of the tree. Range: [0, inf). Default is 0

  • lambda (double) – [optional, default: get_nan64()] L2 regularization parameter on weights. Range: [0, inf). Default is 1

  • observationsPerTreeFraction (double) – [optional, default: get_nan64()] Fraction of observations used for training of one tree, sampling without replacement. Range: (0, 1]. Default is 1 (no sampling, the entire dataset is used)

  • featuresPerNode (size_t) – [optional, default: -1] Number of features tried as possible splits per node. Range : [0, p] where p is the total number of features. Default is 0 (use all features)

  • minObservationsInLeafNode (size_t) – [optional, default: -1] Minimal number of observations in a leaf node. Default is 5.

  • memorySavingMode (bool) – [optional, default: False] If true then use memory saving (but slower) mode. Default is false

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for the random number generator used by the algorithms

  • maxBins (size_t) – [optional, default: -1] Used with ‘inexact’ split finding method only. Maximal number of discrete bins to bucket continuous features. Default is 256. Increasing the number results in higher computation costs

  • minBinSize (size_t) – [optional, default: -1] Used with ‘inexact’ split finding method only. Minimal number of observations in a bin. Default is 5

  • internalOptions (int) – [optional, default: -1] Internal options

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

gbt_classification_training_result

class daal4py.gbt_classification_training_result

Properties:

model

gbt_classification_model

Type:

type

variableImportanceByCover

Numpy array

Type:

type

variableImportanceByGain

Numpy array

Type:

type

variableImportanceByTotalCover

Numpy array

Type:

type

variableImportanceByTotalGain

Numpy array

Type:

type

variableImportanceByWeight

Numpy array

Type:

type

class daal4py.gbt_classification_prediction
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the gbt algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] gradient boosted trees computation method

  • nIterations (size_t) – [optional, default: -1] Number of iterations of the trained model to be used for prediction

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (gbt_classification_modelptr) – Trained gradient boosted trees model

Return type:

gbt_classification_prediction_result

class daal4py.gbt_classification_model

Properties:

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfTrees

size_t

Type:

type

PredictionBias

double

Type:

type

k-Nearest Neighbors (kNN)

Parameters and semantics are described in oneAPI Data Analytics Library k-Nearest Neighbors (kNN).

Examples:
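
A minimal sketch on synthetic data (k=5 is an arbitrary choice):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 3))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)

    train_result = d4p.kdtree_knn_classification_training(nClasses=2, k=5).compute(X, y)
    pred = d4p.kdtree_knn_classification_prediction(nClasses=2, k=5).compute(X, train_result.model).prediction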

class daal4py.kdtree_knn_classification_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for KD-tree based kNN model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] KD-tree based kNN training method

  • k (size_t) – [optional, default: -1] Number of neighbors

  • dataUseInModel (str) – [optional, default: “”] Option to enable/disable use of the input dataset in the kNN model

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for randomly choosing elements from the training dataset

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • voteWeights (str) – [optional, default: “”] Weight function used in prediction

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – [optional, default: None] Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

kdtree_knn_classification_training_result

class daal4py.kdtree_knn_classification_training_result

Properties:

model

kdtree_knn_classification_model

Type:

type

class daal4py.kdtree_knn_classification_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for KD-tree based kNN model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

  • k (size_t) – [optional, default: -1] Number of neighbors

  • dataUseInModel (str) – [optional, default: “”] Option to enable/disable use of the input dataset in the kNN model

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for randomly choosing elements from the training dataset

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • voteWeights (str) – [optional, default: “”] Weight function used in prediction

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (kdtree_knn_classification_modelptr) – Input model trained by the classification algorithm

Return type:

kdtree_knn_classification_prediction_result

class daal4py.kdtree_knn_classification_model

Properties:

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

Brute-force k-Nearest Neighbors (kNN)

Parameters and semantics are described in oneAPI Data Analytics Library k-Nearest Neighbors (kNN).
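
Usage mirrors the KD-tree variant above; a minimal sketch on synthetic data:

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(2)
    X = rng.standard_normal((100, 3))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)

    train_result = d4p.bf_knn_classification_training(nClasses=2, k=5).compute(X, y)
    pred = d4p.bf_knn_classification_prediction(nClasses=2, k=5).compute(X, train_result.model).prediction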

class daal4py.bf_knn_classification_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for BF kNN model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] BF kNN training method

  • k (size_t) – [optional, default: -1] Number of neighbors

  • dataUseInModel (str) – [optional, default: “”] Option to enable/disable use of the input dataset in the kNN model

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • voteWeights (str) – [optional, default: “”] Weight function used in prediction

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for randomly choosing elements from the training dataset

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – [optional, default: None] Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

bf_knn_classification_training_result

class daal4py.bf_knn_classification_training_result

Properties:

model

bf_knn_classification_model

Type:

type

class daal4py.bf_knn_classification_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for BF kNN model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

  • k (size_t) – [optional, default: -1] Number of neighbors

  • dataUseInModel (str) – [optional, default: “”] Option to enable/disable use of the input dataset in the kNN model

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • voteWeights (str) – [optional, default: “”] Weight function used in prediction

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for randomly choosing elements from the training dataset

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (bf_knn_classification_modelptr) – Input model trained by the classification algorithm

Return type:

bf_knn_classification_prediction_result

class daal4py.bf_knn_classification_model

Properties:

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

AdaBoost Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification AdaBoost.

Examples:
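
A minimal sketch on synthetic data; the 0/1 label encoding here is an assumption (check the oneDAL docs for the encoding expected by your library version):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(3)
    X = rng.standard_normal((100, 4))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)  # assumed 0/1 encoding

    train_result = d4p.adaboost_training(nClasses=2).compute(X, y)
    pred = d4p.adaboost_prediction(nClasses=2).compute(X, train_result.model).prediction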

class daal4py.adaboost_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the AdaBoost, double or float

  • method (str) – [optional, default: “defaultDense”] AdaBoost computation method

  • weakLearnerTraining (classifier_training_batch__iface__) – [optional, default: None] The algorithm for weak learner model training

  • weakLearnerPrediction (classifier_prediction_batch__iface__) – [optional, default: None] The algorithm for prediction based on a weak learner model

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the AdaBoost training algorithm

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the AdaBoost training algorithm

  • learningRate (double) – [optional, default: get_nan64()] Multiplier for each classifier to shrink its contribution

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

adaboost_training_result

class daal4py.adaboost_training_result

Properties:

model

adaboost_model

Type:

type

weakLearnersErrors

Numpy array

Type:

type

class daal4py.adaboost_prediction
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the AdaBoost, double or float

  • method (str) – [optional, default: “defaultDense”] AdaBoost computation method

  • weakLearnerTraining (classifier_training_batch__iface__) – [optional, default: None] The algorithm for weak learner model training

  • weakLearnerPrediction (classifier_prediction_batch__iface__) – [optional, default: None] The algorithm for prediction based on a weak learner model

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the AdaBoost training algorithm

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the AdaBoost training algorithm

  • learningRate (double) – [optional, default: get_nan64()] Multiplier for each classifier to shrink its contribution

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (adaboost_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.adaboost_model

Properties:

Alpha

Numpy array

Type:

type

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfWeakLearners

size_t

Type:

type

WeakLearnerModel(idx)
Type:

classifier_model (or derived)

BrownBoost Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification BrownBoost.

Examples:
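
BrownBoost is a binary classifier; a minimal sketch on synthetic data, with the 0/1 label encoding again an assumption:

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(4)
    X = rng.standard_normal((100, 4))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)  # assumed 0/1 encoding

    train_result = d4p.brownboost_training(nClasses=2).compute(X, y)
    pred = d4p.brownboost_prediction(nClasses=2).compute(X, train_result.model).prediction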

class daal4py.brownboost_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for BrownBoost, double or float

  • method (str) – [optional, default: “defaultDense”] BrownBoost computation method

  • weakLearnerTraining (classifier_training_batch__iface__) – [optional, default: None] The algorithm for weak learner model training

  • weakLearnerPrediction (classifier_prediction_batch__iface__) – [optional, default: None] The algorithm for prediction based on a weak learner model

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the BrownBoost training algorithm

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the BrownBoost training algorithm

  • newtonRaphsonAccuracyThreshold (double) – [optional, default: get_nan64()] Accuracy threshold for Newton-Raphson iterations in the BrownBoost training algorithm

  • newtonRaphsonMaxIterations (size_t) – [optional, default: -1] Maximal number of Newton-Raphson iterations in the BrownBoost training algorithm

  • degenerateCasesThreshold (double) – [optional, default: get_nan64()] Threshold needed to avoid degenerate cases in the BrownBoost training algorithm

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

brownboost_training_result

class daal4py.brownboost_training_result

Properties:

model

brownboost_model

Type:

type

class daal4py.brownboost_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the BrownBoost algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] BrownBoost computation method

  • weakLearnerTraining (classifier_training_batch__iface__) – [optional, default: None] The algorithm for weak learner model training

  • weakLearnerPrediction (classifier_prediction_batch__iface__) – [optional, default: None] The algorithm for prediction based on a weak learner model

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the BrownBoost training algorithm

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the BrownBoost training algorithm

  • newtonRaphsonAccuracyThreshold (double) – [optional, default: get_nan64()] Accuracy threshold for Newton-Raphson iterations in the BrownBoost training algorithm

  • newtonRaphsonMaxIterations (size_t) – [optional, default: -1] Maximal number of Newton-Raphson iterations in the BrownBoost training algorithm

  • degenerateCasesThreshold (double) – [optional, default: get_nan64()] Threshold needed to avoid degenerate cases in the BrownBoost training algorithm

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (brownboost_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.brownboost_model

Properties:

Alpha

Numpy array

Type:

type

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfWeakLearners

size_t

Type:

type

WeakLearnerModel(idx)
Type:

classifier_model (or derived)

LogitBoost Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification LogitBoost.

Examples:
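
A minimal multi-class sketch on synthetic data (three classes and maxIterations=20 are arbitrary choices):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(5)
    X = rng.standard_normal((150, 4))
    y = rng.integers(0, 3, size=(150, 1)).astype(np.float64)  # three classes

    train_result = d4p.logitboost_training(nClasses=3, maxIterations=20).compute(X, y)
    pred = d4p.logitboost_prediction(nClasses=3).compute(X, train_result.model).prediction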

class daal4py.logitboost_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for LogitBoost, double or float

  • method (str) – [optional, default: “friedman”] LogitBoost computation method

  • weakLearnerTraining (regression_training_batch__iface__) – [optional, default: None] The algorithm for weak learner model training

  • weakLearnerPrediction (regression_prediction_batch__iface__) – [optional, default: None] The algorithm for prediction based on a weak learner model

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the LogitBoost training algorithm

  • maxIterations (size_t) – [optional, default: -1] Maximal number of terms in additive regression

  • weightsDegenerateCasesThreshold (double) – [optional, default: get_nan64()] Threshold to avoid degenerate cases when calculating weights W

  • responsesDegenerateCasesThreshold (double) – [optional, default: get_nan64()] Threshold to avoid degenerate cases when calculating responses Z

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

logitboost_training_result

class daal4py.logitboost_training_result

Properties:

model

logitboost_model

Type:

type

class daal4py.logitboost_prediction
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the LogitBoost algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] LogitBoost computation method

  • weakLearnerTraining (regression_training_batch__iface__) – [optional, default: None] The algorithm for weak learner model training

  • weakLearnerPrediction (regression_prediction_batch__iface__) – [optional, default: None] The algorithm for prediction based on a weak learner model

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the LogitBoost training algorithm

  • maxIterations (size_t) – [optional, default: -1] Maximal number of terms in additive regression

  • weightsDegenerateCasesThreshold (double) – [optional, default: get_nan64()] Threshold to avoid degenerate cases when calculating weights W

  • responsesDegenerateCasesThreshold (double) – [optional, default: get_nan64()] Threshold to avoid degenerate cases when calculating responses Z

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (logitboost_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.logitboost_model

Properties:

Iterations

size_t

Type:

type

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfWeakLearners

size_t

Type:

type

WeakLearnerModel(idx)
Type:

regression_model (or derived)

Stump Weak Learner Classification

Parameters and semantics are described in oneAPI Data Analytics Library Classification Weak Learner Stump.

Examples:
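
A minimal sketch on synthetic binary data (0/1 label encoding assumed):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(6)
    X = rng.standard_normal((100, 4))
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)

    train_result = d4p.stump_classification_training(nClasses=2).compute(X, y)
    pred = d4p.stump_classification_prediction(nClasses=2).compute(X, train_result.model).prediction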

class daal4py.stump_classification_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the decision stump training method, double or float

  • method (str) – [optional, default: “defaultDense”] Decision stump training method

  • splitCriterion (str) – [optional, default: “”] Split criterion for stump classification

  • varImportance (str) – [optional, default: “”] Variable importance computation mode

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

stump_classification_training_result

class daal4py.stump_classification_training_result

Properties:

model

stump_classification_model

Type:

type

variableImportance

Numpy array

Type:

type

class daal4py.stump_classification_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the decision stump prediction algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Decision stump model-based prediction method

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (stump_classification_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.stump_classification_model

Properties:

LeftValue

double

Type:

type

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

RightValue

double

Type:

type

SplitFeature

size_t

Type:

type

SplitValue

double

Type:

type

Multinomial Naive Bayes

Parameters and semantics are described in oneAPI Data Analytics Library Naive Bayes.

Examples:
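
A minimal sketch on synthetic count data (multinomial naive Bayes expects non-negative feature counts):

    import numpy as np
    import daal4py as d4p

    rng = np.random.default_rng(8)
    X = rng.integers(0, 10, size=(100, 20)).astype(np.float64)  # non-negative counts
    y = rng.integers(0, 2, size=(100, 1)).astype(np.float64)

    train_result = d4p.multinomial_naive_bayes_training(nClasses=2).compute(X, y)
    pred = d4p.multinomial_naive_bayes_prediction(nClasses=2).compute(X, train_result.model).prediction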

class daal4py.multinomial_naive_bayes_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for multinomial naive Bayes training, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method

  • priorClassEstimates (array) – [optional, default: None] Prior class estimates

  • alpha (array) – [optional, default: None] Imagined occurrences of each word

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] enable streaming

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Weights of the observations in the training data set

Return type:

multinomial_naive_bayes_training_result

class daal4py.multinomial_naive_bayes_training_result

Properties:

model

multinomial_naive_bayes_model

Type:

type

class daal4py.multinomial_naive_bayes_prediction
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for prediction based on the multinomial naive Bayes model, double or float

  • method (str) – [optional, default: “defaultDense”] Multinomial naive Bayes prediction method

  • priorClassEstimates (array) – [optional, default: None] Prior class estimates

  • alpha (array) – [optional, default: None] Imagined occurrences of each word

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (multinomial_naive_bayes_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.multinomial_naive_bayes_model

Properties:

AuxTable

Numpy array

Type:

type

LogP

Numpy array

Type:

type

LogTheta

Numpy array

Type:

type

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

Support Vector Machine (SVM)

Parameters and semantics are described in oneAPI Data Analytics Library SVM.

Note: the labels parameter expects the two class labels to be encoded as -1 and 1

Examples:
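
A minimal two-class sketch; per the note above, labels are encoded as -1 and 1, and the linear kernel object and C value are illustrative choices:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(200, 10)
    y = np.where(X[:, [0]] > 0.5, 1.0, -1.0)  # class labels encoded as -1 and 1

    kernel = d4p.kernel_function_linear()  # keep the kernel object alive for both steps
    train_result = d4p.svm_training(nClasses=2, kernel=kernel, C=1.0).compute(X, y)

    predict_result = d4p.svm_prediction(kernel=kernel).compute(X, train_result.model)
    labels = np.where(predict_result.prediction >= 0, 1, -1)  # sign gives the class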

class daal4py.svm_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the SVM training algorithm, double or float

  • method (str) – [optional, default: “boser”] SVM training method

  • C (double) – [optional, default: get_nan64()] Upper bound in constraints of the quadratic optimization problem

  • accuracyThreshold (double) – [optional, default: get_nan64()] Training accuracy

  • tau (double) – [optional, default: get_nan64()] Tau parameter of the working set selection scheme

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations for the algorithm

  • cacheSize (size_t) – [optional, default: -1] Size of cache in bytes to store values of the kernel matrix. A non-zero value enables use of a cache optimization technique

  • doShrinking (bool) – [optional, default: False] Flag that enables use of the shrinking optimization technique

  • shrinkingStep (size_t) – [optional, default: -1] Number of iterations between the steps of shrinking optimization technique

  • kernel (kernel_function_kerneliface__iface__) – [optional, default: None] Kernel function

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Optional. Weights of the observations in the training data set

Return type:

svm_training_result

class daal4py.svm_training_result

Properties:

model

svm_model

Type:

type

class daal4py.svm_prediction
Parameters:
  • fptype (str) – [optional, default: “double”]

  • method (str) – [optional, default: “defaultDense”]

  • C (double) – [optional, default: get_nan64()] Upper bound in constraints of the quadratic optimization problem

  • accuracyThreshold (double) – [optional, default: get_nan64()] Training accuracy

  • tau (double) – [optional, default: get_nan64()] Tau parameter of the working set selection scheme

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations for the algorithm

  • cacheSize (size_t) – [optional, default: -1] Size of cache in bytes to store values of the kernel matrix. A non-zero value enables use of a cache optimization technique

  • doShrinking (bool) – [optional, default: False] Flag that enables use of the shrinking optimization technique

  • shrinkingStep (size_t) – [optional, default: -1] Number of iterations between the steps of shrinking optimization technique

  • kernel (kernel_function_kerneliface__iface__) – [optional, default: None] Kernel function

  • nClasses (size_t) – [optional, default: -1] Number of classes

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (svm_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.svm_model

Properties:

Bias

double

Type:

type

ClassificationCoefficients

Numpy array

Type:

type

NFeatures

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

SupportIndices

Numpy array

Type:

type

SupportVectors

Numpy array

Type:

type

Logistic Regression

Parameters and semantics are described in oneAPI Data Analytics Library Logistic Regression.

Examples:
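
A minimal sketch requesting both class labels and probabilities via resultsToEvaluate (synthetic data; the shapes are illustrative):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(200, 10)
    y = np.random.randint(0, 2, size=(200, 1)).astype(np.float64)

    train_result = d4p.logistic_regression_training(
        nClasses=2, interceptFlag=True).compute(X, y)

    predict_algo = d4p.logistic_regression_prediction(
        nClasses=2,
        resultsToEvaluate="computeClassLabels|computeClassProbabilities")
    predict_result = predict_algo.compute(X, train_result.model)
    # predict_result.prediction: class labels;
    # predict_result.probabilities: per-class probabilities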

class daal4py.logistic_regression_training
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for logistic regression, double or float

  • method (str) – [optional, default: “defaultDense”] logistic regression computation method

  • interceptFlag (bool) – [optional, default: False] Whether the intercept needs to be computed

  • penaltyL1 (float) – [optional, default: get_nan32()] L1 regularization coefficient. Default is 0 (not applied)

  • penaltyL2 (float) – [optional, default: get_nan32()] L2 regularization coefficient. Default is 0 (not applied)

  • optimizationSolver (optimization_solver_iterative_solver_batch__iface__) – [optional, default: None] Default is sgd momentum solver

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, labels, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Training data set

  • labels (data_or_file) – Labels of the training data set

  • weights (data_or_file) – [optional, default: None] Optional. Weights of the observations in the training data set

Return type:

logistic_regression_training_result

class daal4py.logistic_regression_training_result

Properties:

model

logistic_regression_model

Type:

type

class daal4py.logistic_regression_prediction
Parameters:
  • nClasses (size_t) – Number of classes

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the logistic regression algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] logistic regression computation method

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data set

  • model (logistic_regression_modelptr) – Input model trained by the classification algorithm

Return type:

classifier_prediction_result

class daal4py.logistic_regression_model

Properties:

Beta

Numpy array

Type:

type

InterceptFlag

bool

Type:

type

NFeatures

size_t

Type:

type

NumberOfBetas

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

Regression

Decision Forest Regression

Parameters and semantics are described in oneAPI Data Analytics Library Regression Decision Forest.

Examples:
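
A minimal sketch; the nTrees, varImportance and resultsToCompute values follow the upstream daal4py examples and are illustrative, not the only valid choices:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(500, 8)
    y = X.sum(axis=1, keepdims=True) + 0.1 * np.random.rand(500, 1)

    train_result = d4p.decision_forest_regression_training(
        nTrees=100,
        varImportance="MDA_Raw",
        resultsToCompute="computeOutOfBagError").compute(X, y)
    # train_result.variableImportance and train_result.outOfBagError are also populated

    predict_result = d4p.decision_forest_regression_prediction().compute(
        X, train_result.model)
    # predict_result.prediction: one predicted value per row of X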

class daal4py.decision_forest_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for decision forest model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] decision forest training method

  • nTrees (size_t) – [optional, default: -1] Number of trees in the forest. Default is 10

  • observationsPerTreeFraction (double) – [optional, default: get_nan64()] Fraction of observations used for a training of one tree, 0 to 1. Default is 1 (sampling with replacement)

  • featuresPerNode (size_t) – [optional, default: -1] Number of features tried as possible splits per node. If 0 then sqrt(p) for classification, p/3 for regression, where p is the total number of features.

  • maxTreeDepth (size_t) – [optional, default: -1] Maximal tree depth. Default is 0 (unlimited)

  • minObservationsInLeafNode (size_t) – [optional, default: -1] Minimal number of observations in a leaf node. Default is 1 for classification, 5 for regression.

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for the random numbers generator used by the algorithms

  • impurityThreshold (double) – [optional, default: get_nan64()] Threshold value used as a stopping criterion: if the impurity value in the node is smaller than the threshold, the node is not split further.

  • varImportance (str) – [optional, default: “”] Variable importance computation mode

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • memorySavingMode (bool) – [optional, default: False] If true then use memory saving (but slower) mode

  • bootstrap (bool) – [optional, default: False] If true then training set for a tree is a bootstrap of the whole training set

  • minObservationsInSplitNode (size_t) – [optional, default: -1] Minimal number of observations in a split node. Default 2

  • minWeightFractionInLeafNode (double) – [optional, default: get_nan64()] The minimum weighted fraction of the sum total of weights (of all the input observations) required to be at a leaf node, 0.0 to 0.5. Default is 0.0

  • minImpurityDecreaseInSplitNode (double) – [optional, default: get_nan64()] A node will be split if this split induces a decrease of the impurity greater than or equal to the value, non-negative. Default is 0.0

  • maxLeafNodes (size_t) – [optional, default: -1] Maximum number of leaf nodes. Default is 0 (unlimited)

  • maxBins (size_t) – [optional, default: -1] Used with ‘hist’ split finding method only. Maximal number of discrete bins to bucket continuous features. Default is 256. Increasing the number results in higher computation costs

  • minBinSize (size_t) – [optional, default: -1] Used with ‘hist’ split finding method only. Minimal number of observations in a bin. Default is 5

  • splitter (str) – [optional, default: “”] Sets node splitting method. Default is best

  • binningStrategy (str) – [optional, default: “”] Used with ‘hist’ split finding method only. Selects the strategy to group data points into bins. Allowed values are ‘quantiles’ (default), ‘averages’

compute(data, dependentVariable, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariable (data_or_file) – Values of the dependent variable for the input data

  • weights (data_or_file) – [optional, default: None] Optional. Weights of the observations in the training data set

Return type:

decision_forest_regression_training_result

class daal4py.decision_forest_regression_training_result

Properties:

model

decision_forest_regression_model

Type:

type

outOfBagError

Numpy array

Type:

type

outOfBagErrorPerObservation

Numpy array

Type:

type

outOfBagErrorPrediction

Numpy array

Type:

type

outOfBagErrorR2

Numpy array

Type:

type

variableImportance

Numpy array

Type:

type

class daal4py.decision_forest_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for decision forest model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (decision_forest_regression_modelptr) – Trained decision tree model

Return type:

decision_forest_regression_prediction_result

class daal4py.decision_forest_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.decision_forest_regression_model

Properties:

NumberOfFeatures

size_t

Type:

type

NumberOfTrees

size_t

Type:

type

Decision Tree Regression

Parameters and semantics are described in oneAPI Data Analytics Library Regression Decision Tree.

Examples:
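
A minimal sketch; note that compute() takes a separate pruning set, emulated here with a split of the synthetic data:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(400, 5)
    y = 2.0 * X[:, [0]] + 0.1 * np.random.rand(400, 1)
    X_train, y_train = X[:300], y[:300]   # training set
    X_prune, y_prune = X[300:], y[300:]   # held-out pruning set

    train_result = d4p.decision_tree_regression_training().compute(
        X_train, y_train, X_prune, y_prune)
    predict_result = d4p.decision_tree_regression_prediction().compute(
        X_train, train_result.model)
    # predict_result.prediction holds the predicted dependent-variable values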

class daal4py.decision_tree_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for Decision tree model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] Decision tree training method

  • pruning (str) – [optional, default: “”] Pruning method for Decision tree

  • maxTreeDepth (size_t) – [optional, default: -1] Maximum tree depth. 0 means unlimited depth.

  • minObservationsInLeafNodes (size_t) – [optional, default: -1] Minimum number of observations in the leaf node. Can be any positive number.

compute(data, dependentVariables, dataForPruning, dependentVariablesForPruning, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariables (data_or_file) – Values of the dependent variable for the input data

  • dataForPruning (data_or_file) – Pruning data set

  • dependentVariablesForPruning (data_or_file) – Labels of the pruning data set

  • weights (data_or_file) – [optional, default: None] Optional. Weights of the observations in the training data set

Return type:

decision_tree_regression_training_result

class daal4py.decision_tree_regression_training_result

Properties:

model

decision_tree_regression_model

Type:

type

class daal4py.decision_tree_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for Decision tree model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

  • pruning (str) – [optional, default: “”] Pruning method for Decision tree

  • maxTreeDepth (size_t) – [optional, default: -1] Maximum tree depth. 0 means unlimited depth.

  • minObservationsInLeafNodes (size_t) – [optional, default: -1] Minimum number of observations in the leaf node. Can be any positive number.

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (decision_tree_regression_modelptr) – Trained decision tree model

Return type:

decision_tree_regression_prediction_result

class daal4py.decision_tree_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.decision_tree_regression_model

Properties:

NumberOfFeatures

size_t

Type:

type

Gradient Boosted Regression

Parameters and semantics are described in oneAPI Data Analytics Library Regression Gradient Boosted Tree.

Examples:
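
A minimal sketch; maxIterations (the number of boosting rounds) is an illustrative choice:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(500, 8)
    y = np.sin(X[:, [0]]) + 0.05 * np.random.rand(500, 1)

    train_result = d4p.gbt_regression_training(maxIterations=100).compute(X, y)
    predict_result = d4p.gbt_regression_prediction().compute(X, train_result.model)
    # predict_result.prediction: one predicted value per row of X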

class daal4py.gbt_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] gradient boosted trees training method

  • loss (str) – [optional, default: “”] Loss function type

  • varImportance (str) – [optional, default: “”] 64-bit integer flag (VariableImportanceModes) that indicates the variable importance computation modes

  • splitMethod (str) – [optional, default: “”] Split finding method. Default is exact

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the gradient boosted trees training algorithm. Default is 50

  • maxTreeDepth (size_t) – [optional, default: -1] Maximal tree depth, 0 for unlimited. Default is 6

  • shrinkage (double) – [optional, default: get_nan64()] Learning rate of the boosting procedure. Scales the contribution of each tree by a factor (0, 1]. Default is 0.3

  • minSplitLoss (double) – [optional, default: get_nan64()] Loss regularization parameter. Min loss reduction required to make a further partition on a leaf node of the tree. Range: [0, inf). Default is 0

  • lambda (double) – [optional, default: get_nan64()] L2 regularization parameter on weights. Range: [0, inf). Default is 1

  • observationsPerTreeFraction (double) – [optional, default: get_nan64()] Fraction of observations used for a training of one tree, sampling without replacement. Range: (0, 1]. Default is 1 (no sampling, entire dataset is used)

  • featuresPerNode (size_t) – [optional, default: -1] Number of features tried as possible splits per node. Range : [0, p] where p is the total number of features. Default is 0 (use all features)

  • minObservationsInLeafNode (size_t) – [optional, default: -1] Minimal number of observations in a leaf node. Default is 5.

  • memorySavingMode (bool) – [optional, default: False] If true then use memory saving (but slower) mode. Default is false

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for the random numbers generator used by the algorithms

  • maxBins (size_t) – [optional, default: -1] Used with ‘inexact’ split finding method only. Maximal number of discrete bins to bucket continuous features. Default is 256. Increasing the number results in higher computation costs

  • minBinSize (size_t) – [optional, default: -1] Used with ‘inexact’ split finding method only. Minimal number of observations in a bin. Default is 5

  • internalOptions (int) – [optional, default: -1] Internal options

compute(data, dependentVariable)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariable (data_or_file) – Values of the dependent variable for the input data

Return type:

gbt_regression_training_result

class daal4py.gbt_regression_training_result

Properties:

model

gbt_regression_model

Type:

type

variableImportanceByCover

Numpy array

Type:

type

variableImportanceByGain

Numpy array

Type:

type

variableImportanceByTotalCover

Numpy array

Type:

type

variableImportanceByTotalGain

Numpy array

Type:

type

variableImportanceByWeight

Numpy array

Type:

type

class daal4py.gbt_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

  • nIterations (size_t) – [optional, default: -1] Number of iterations of the trained model to be used for prediction

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (gbt_regression_modelptr) – Trained gradient boosted trees model

Return type:

gbt_regression_prediction_result

class daal4py.gbt_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.gbt_regression_model

Properties:

NumberOfFeatures

size_t

Type:

type

NumberOfTrees

size_t

Type:

type

PredictionBias

double

Type:

type

Linear Regression

Parameters and semantics are described in oneAPI Data Analytics Library Linear Regression.

Examples:
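
A minimal sketch on synthetic data; the comment on the Beta layout assumes oneDAL's intercept-first convention:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 4)
    coefs = np.array([[1.0], [2.0], [3.0], [4.0]])
    y = X @ coefs + 5.0  # known linear relationship with intercept 5

    train_result = d4p.linear_regression_training(interceptFlag=True).compute(X, y)
    # train_result.model.Beta: the intercept followed by the four coefficients

    predict_result = d4p.linear_regression_prediction().compute(X, train_result.model)
    # predict_result.prediction should closely match y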

class daal4py.linear_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for linear regression model-based training, double or float

  • method (str) – [optional, default: “normEqDense”] Linear regression training method

  • interceptFlag (bool) – [optional, default: False] Flag that indicates whether the intercept needs to be computed

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] enable streaming

compute(data, dependentVariables)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariables (data_or_file) – Values of the dependent variable for the input data

Return type:

linear_regression_training_result

class daal4py.linear_regression_training_result

Properties:

model

linear_regression_model

Type:

type

class daal4py.linear_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for linear regression model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (linear_regression_modelptr) – Trained linear regression model

Return type:

linear_regression_prediction_result

class daal4py.linear_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.linear_regression_model

Properties:

Beta

Numpy array

Type:

type

InterceptFlag

bool

Type:

type

NumberOfBetas

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfResponses

size_t

Type:

type

LASSO Regression

Parameters and semantics are described in oneAPI Data Analytics Library Least Absolute Shrinkage and Selection Operator.

Examples:
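
A minimal sketch; the 1 x 1 lassoParameters table (one L1 penalty for the single response) is an illustrative assumption:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(200, 6)
    y = X[:, [0]] - X[:, [1]] + 0.01 * np.random.rand(200, 1)

    train_result = d4p.lasso_regression_training(
        interceptFlag=True,
        lassoParameters=np.array([[0.1]])).compute(X, y)  # one penalty, one response

    predict_result = d4p.lasso_regression_prediction().compute(X, train_result.model)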

class daal4py.lasso_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for lasso regression model-based training, double or float

  • method (str) – [optional, default: “defaultDense”] LASSO regression training method

  • lassoParameters (array) – [optional, default: None] Numeric table that contains values of lasso parameters

  • optimizationSolver (optimization_solver_iterative_solver_batch__iface__) – [optional, default: None] Default is coordinate descent solver

  • dataUseInComputation (str) – [optional, default: “”] Flag that allows the algorithm to modify (corrupt) the input data during computation

  • optResultToCompute (str) – [optional, default: “”] 64 bit integer flag that indicates the optional results to compute

  • interceptFlag (bool) – [optional, default: False] Flag that indicates whether the intercept needs to be computed

compute(data, dependentVariables, weights, gramMatrix)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariables (data_or_file) – Values of the dependent variable for the input data

  • weights (data_or_file) – [optional, default: None] NumericTable of size 1 x n with weights of the samples. Applied for all methods

  • gramMatrix (data_or_file) – [optional, default: None] NumericTable of size p x p with the Gram matrix of the input data. Applied for all methods

Return type:

lasso_regression_training_result

class daal4py.lasso_regression_training_result

Properties:

gramMatrixId

Numpy array

Type:

type

model

lasso_regression_model

Type:

type

class daal4py.lasso_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for lasso regression model-based prediction

  • method (str) – [optional, default: “defaultDense”]

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (lasso_regression_modelptr) – Trained lasso regression model

Return type:

lasso_regression_prediction_result

class daal4py.lasso_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.lasso_regression_model

Properties:

Beta

Numpy array

Type:

type

InterceptFlag

bool

Type:

type

NumberOfBetas

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfResponses

size_t

Type:

type

Ridge Regression

Parameters and semantics are described in oneAPI Data Analytics Library Ridge Regression.

Examples:
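
A minimal sketch; the 1 x 1 ridgeParameters table is an illustrative assumption:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(200, 6)
    y = X.sum(axis=1, keepdims=True)

    train_result = d4p.ridge_regression_training(
        interceptFlag=True,
        ridgeParameters=np.array([[1.0]])).compute(X, y)  # single ridge penalty

    predict_result = d4p.ridge_regression_prediction().compute(X, train_result.model)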

class daal4py.ridge_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for ridge regression model-based training, double or float

  • method (str) – [optional, default: “normEqDense”] Ridge regression training method

  • ridgeParameters (array) – [optional, default: None] Numeric table that contains values of ridge parameters

  • interceptFlag (bool) – [optional, default: False] Flag that indicates whether the intercept needs to be computed

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] enable streaming

compute(data, dependentVariables)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariables (data_or_file) – Values of the dependent variable for the input data

Return type:

ridge_regression_training_result

class daal4py.ridge_regression_training_result

Properties:

model

ridge_regression_model

Type:

type

class daal4py.ridge_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for ridge regression model-based prediction

  • method (str) – [optional, default: “defaultDense”] Computation method in the batch processing mode

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (ridge_regression_modelptr) – Trained ridge regression model

Return type:

ridge_regression_prediction_result

class daal4py.ridge_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.ridge_regression_model

Properties:

Beta

Numpy array

Type:

type

InterceptFlag

bool

Type:

type

NumberOfBetas

size_t

Type:

type

NumberOfFeatures

size_t

Type:

type

NumberOfResponses

size_t

Type:

type

Stump Regression

Parameters and semantics are described in oneAPI Data Analytics Library Regression Stump.

Examples:
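
A minimal sketch; since a stump is a single-split tree, the fitted model exposes the split and the two leaf values directly:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    y = np.where(X[:, [1]] > 0.5, 2.0, -2.0)  # one informative feature

    model = d4p.stump_regression_training().compute(X, y).model
    # model.SplitFeature / model.SplitValue describe the single split;
    # model.LeftValue / model.RightValue are the two leaf responses

    predict_result = d4p.stump_regression_prediction().compute(X, model)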

class daal4py.stump_regression_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the decision stump training method, double or float

  • method (str) – [optional, default: “defaultDense”] Decision stump training method

  • varImportance (str) – [optional, default: “”] Variable importance mode. Variable importance computation is not supported in the current version of the library

compute(data, dependentVariables, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • dependentVariables (data_or_file) – Values of the dependent variable for the input data

  • weights (data_or_file) – [optional, default: None] Optional. Weights of the observations in the training data set. Some values are skipped for backward compatibility.

Return type:

stump_regression_training_result

class daal4py.stump_regression_training_result

Properties:

model

stump_regression_model

Type:

type

variableImportance

Numpy array

Type:

type

class daal4py.stump_regression_prediction
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the decision stump prediction algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Decision stump model-based prediction method

  • varImportance (str) – [optional, default: “”] Variable importance mode. Variable importance computation is not supported in the current version of the library

compute(data, model)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • model (stump_regression_modelptr) – Trained regression model

Return type:

stump_regression_prediction_result

class daal4py.stump_regression_prediction_result

Properties:

prediction

Numpy array

Type:

type

class daal4py.stump_regression_model

Properties:

LeftValue

double

Type:

type

NumberOfFeatures

size_t

Type:

type

RightValue

double

Type:

type

SplitFeature

size_t

Type:

type

SplitValue

double

Type:

type

Clustering

K-Means Clustering

Parameters and semantics are described in oneAPI Data Analytics Library K-Means Clustering.

Examples:
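
A minimal sketch chaining the kmeans_init and kmeans classes documented below (nClusters, maxIterations and the plusPlusDense init method are illustrative choices):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(1000, 2)

    init_result = d4p.kmeans_init(nClusters=5, method="plusPlusDense").compute(X)
    result = d4p.kmeans(nClusters=5, maxIterations=300, assignFlag=True).compute(
        X, init_result.centroids)
    # result.centroids, result.assignments, result.nIterations and
    # result.objectiveFunction hold the clustering outputs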

K-Means Initialization

Parameters and semantics are described in oneAPI Data Analytics Library K-Means Initialization.

class daal4py.kmeans_init
Parameters:
  • nClusters (size_t) – Number of clusters

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of initial clusters for K-Means algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Method of computing initial clusters for the algorithm

  • nTrials (size_t) – [optional, default: -1] Kmeans++ only. The number of trials to generate all clusters but the first initial cluster.

  • oversamplingFactor (double) – [optional, default: get_nan64()] Kmeans|| only. A fraction of nClusters chosen in each of the nRounds of kmeans||. L = nClusters * oversamplingFactor points are sampled in each round.

  • nRounds (size_t) – [optional, default: -1] Kmeans|| only. Number of rounds for k-means||. (oversamplingFactor*nRounds) > 1 is a requirement.

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine to be used for generating random numbers for the initialization

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

kmeans_init_result

class daal4py.kmeans_init_result

Properties:

centroids

Numpy array

Type:

type

K-Means

Parameters and semantics are described in oneAPI Data Analytics Library K-Means Computation.

class daal4py.kmeans
Parameters:
  • nClusters (size_t) – Number of clusters

  • maxIterations (size_t) – Number of iterations

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of K-Means, double or float

  • method (str) – [optional, default: “lloydDense”] Computation method of the algorithm

  • accuracyThreshold (double) – [optional, default: get_nan64()] Threshold for the termination of the algorithm

  • gamma (double) – [optional, default: get_nan64()] Weight used in distance computation for categorical features

  • distanceType (str) – [optional, default: “”] Distance used in the algorithm

  • resultsToEvaluate (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • assignFlag (bool) – [optional, default: False] Whether to compute assignments of data points to the clusters

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

compute(data, inputCentroids)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • inputCentroids (data_or_file) – Initial centroids for the algorithm

Return type:

kmeans_result

class daal4py.kmeans_result

Properties:

assignments

Numpy array

Type:

type

centroids

Numpy array

Type:

type

nIterations

Numpy array

Type:

type

objectiveFunction

Numpy array

Type:

type

DBSCAN

Parameters and semantics are described in oneAPI Data Analytics Library Density-Based Spatial Clustering of Applications with Noise.

Examples:
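
A minimal sketch; epsilon and minObservations are illustrative, and the resultsToCompute values follow the upstream daal4py DBSCAN example:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(500, 2)

    result = d4p.dbscan(
        epsilon=0.1, minObservations=5,
        resultsToCompute="computeCoreIndices|computeCoreObservations").compute(X)
    # result.assignments: cluster index per observation; result.nClusters,
    # result.coreIndices and result.coreObservations are filled as requested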

class daal4py.dbscan
Parameters:
  • epsilon (double) – Radius of neighborhood

  • minObservations (size_t) – Minimal total weight of observations in neighborhood of core observation

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of DBSCAN, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the algorithm

  • memorySavingMode (bool) – [optional, default: False] If true then use memory saving (but slower) mode

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • blockIndex (size_t) – [optional, default: -1] Unique identifier of block initially passed for computation on the local node

  • nBlocks (size_t) – [optional, default: -1] Number of blocks initially passed for computation on all nodes

  • leftBlocks (size_t) – [optional, default: -1] Number of blocks that will process observations with value of selected split feature lesser than selected split value

  • rightBlocks (size_t) – [optional, default: -1] Number of blocks that will process observations with value of selected split feature greater than selected split value

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

compute(data, weights)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • weights (data_or_file) – [optional, default: None] Input weights of observations

Return type:

dbscan_result

class daal4py.dbscan_result

Properties:

assignments

Numpy array

Type:

type

coreIndices

Numpy array

Type:

type

coreObservations

Numpy array

Type:

type

nClusters

Numpy array

Type:

type

Gaussian Mixtures

Parameters and semantics are described in oneAPI Data Analytics Library Expectation-Maximization.

Initialization for the Gaussian Mixture Model

Parameters and semantics are described in oneAPI Data Analytics Library Expectation-Maximization Initialization.

class daal4py.em_gmm_init
Parameters:
  • nComponents (size_t) – Number of components in the Gaussian mixture model

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of initial values for the EM for GMM algorithm, double or float

  • method (str) – [optional, default: “defaultDense”]

  • nTrials (size_t) – [optional, default: -1] Number of trials of short EM runs

  • nIterations (size_t) – [optional, default: -1] Number of iterations in every short EM run

  • accuracyThreshold (double) – [optional, default: get_nan64()] Threshold for the termination of the algorithm

  • covarianceStorage (str) – [optional, default: “”] Type of covariance in the Gaussian mixture model.

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine to be used for randomly generating data points to start the initialization of short EM

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

em_gmm_init_result

class daal4py.em_gmm_init_result

Properties:

covariances

Numpy array

Type:

type

means

Numpy array

Type:

type

weights

Numpy array

Type:

type

EM algorithm for the Gaussian Mixture Model

Parameters and semantics are described in oneAPI Data Analytics Library Expectation-Maximization for the Gaussian Mixture Model.

Examples:
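
A minimal sketch seeding em_gmm with the output of em_gmm_init above (the component count is an illustrative choice):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(500, 3)
    n_components = 2

    init_result = d4p.em_gmm_init(n_components).compute(X)
    result = d4p.em_gmm(n_components).compute(
        X, init_result.weights, init_result.means, init_result.covariances)
    # result.means, result.covariances and result.weights describe the fitted mixture;
    # result.goalFunction and result.nIterations report convergence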

class daal4py.em_gmm
Parameters:
  • nComponents (size_t) – Number of components in the Gaussian mixture model

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the EM for GMM algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] EM for GMM computation method

  • maxIterations (size_t) – [optional, default: -1] Maximal number of iterations of the algorithm.

  • accuracyThreshold (double) – [optional, default: get_nan64()] Threshold for the termination of the algorithm.

  • regularizationFactor (double) – [optional, default: get_nan64()] Factor for covariance regularization in case of ill-conditioned data

  • covarianceStorage (str) – [optional, default: “”] Type of covariance in the Gaussian mixture model.

compute(data, inputWeights, inputMeans, inputCovariances)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • inputWeights (data_or_file) – Input weights

  • inputMeans (data_or_file) – Input means

  • inputCovariances (list_numerictableptr) – Collection of input covariances

Return type:

em_gmm_result

class daal4py.em_gmm_result

Properties:

covariances

Numpy array

Type:

type

goalFunction

Numpy array

Type:

type

means

Numpy array

Type:

type

nIterations

Numpy array

Type:

type

weights

Numpy array

Type:

type

Dimensionality reduction

Principal Component Analysis (PCA)

Parameters and semantics are described in oneAPI Data Analytics Library PCA.

Examples:
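
A minimal sketch; the resultsToCompute string follows the upstream daal4py PCA example:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(300, 10)

    result = d4p.pca(resultsToCompute="mean|variance|eigenvalue",
                     isDeterministic=True).compute(X)
    # result.eigenvectors / result.eigenvalues: the principal components;
    # result.dataForTransform feeds pca_transform (next section)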

class daal4py.pca
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for PCA, double or float

  • method (str) – [optional, default: “correlationDense”] PCA computation method

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • nComponents (size_t) – [optional, default: -1] number of components for reduced implementation (applicable for batch mode only)

  • isDeterministic (bool) – [optional, default: False] sign flip if required

  • doScale (bool) – [optional, default: False] scaling if required

  • isCorrelation (bool) – [optional, default: False] correlation is provided

  • normalization (normalization_zscore_batchimpl__iface__) – [optional, default: None] Pointer to batch covariance

  • distributed (bool) – [optional, default: False] enable distributed computation (SPMD)

compute(data, correlation)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • correlation (data_or_file) – [optional, default: None] Input correlation table

Return type:

pca_result

class daal4py.pca_result

Properties:

dataForTransform

Numpy array

Type:

type

eigenvalues

Numpy array

Type:

type

eigenvectors

Numpy array

Type:

type

means

Numpy array

Type:

type

variances

Numpy array

Type:

type

Principal Component Analysis (PCA) Transform

Parameters and semantics are described in oneAPI Data Analytics Library PCA Transform.

Examples:
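
A minimal sketch continuing the PCA example above; nComponents=2 is an illustrative choice:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(300, 10)
    pca_result = d4p.pca(resultsToCompute="mean|variance|eigenvalue",
                         isDeterministic=True).compute(X)

    transform_result = d4p.pca_transform(nComponents=2).compute(
        X, pca_result.eigenvectors, pca_result.dataForTransform)
    # transform_result.transformedData has shape (300, 2)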

class daal4py.pca_transform
Parameters:
  • fptype (str) – [optional, default: “double”]

  • method (str) – [optional, default: “defaultDense”]

  • nComponents (size_t) – [optional, default: -1]

compute(data, eigenvectors, dataForTransform)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • eigenvectors (data_or_file) – Transformation matrix of eigenvectors

  • dataForTransform (dict_numerictableptr) – Data for transform

Return type:

pca_transform_result

class daal4py.pca_transform_result

Properties:

transformedData

Numpy array

Type:

type

Outlier detection

Multivariate Outlier Detection

Parameters and semantics are described in oneAPI Data Analytics Library Multivariate Outlier Detection.

Examples:
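
A minimal sketch; when location, scatter and threshold are omitted, the algorithm derives them from the data:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(200, 4)
    X[0] += 10.0  # plant one obvious outlier

    result = d4p.multivariate_outlier_detection().compute(X)
    # result.weights is an n x 1 table; 0 flags an outlier, 1 an inlier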

class daal4py.multivariate_outlier_detection
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the multivariate outlier detection, double or float

  • method (str) – [optional, default: “defaultDense”] Multivariate outlier detection computation method

compute(data, location, scatter, threshold)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • location (data_or_file) – [optional, default: None] Vector of mean estimates of size 1 x p

  • scatter (data_or_file) – [optional, default: None] Measure of spread, the variance-covariance matrix of size p x p

  • threshold (data_or_file) – [optional, default: None] Limit that defines the outlier region, the array of size 1 x 1 containing a non-negative number

Return type:

multivariate_outlier_detection_result

class daal4py.multivariate_outlier_detection_result

Properties:

weights

Numpy array

Type:

type

Univariate Outlier Detection

Parameters and semantics are described in oneAPI Data Analytics Library Univariate Outlier Detection.

class daal4py.univariate_outlier_detection
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the univariate outlier detection algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] univariate outlier detection computation method

compute(data, location, scatter, threshold)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table

  • location (data_or_file) – [optional, default: None] Vector of mean estimates of size 1 x p

  • scatter (data_or_file) – [optional, default: None] Measure of spread, the array of standard deviations of size 1 x p

  • threshold (data_or_file) – [optional, default: None] Limit that defines the outlier region, the array of non-negative numbers of size 1 x p

Return type:

univariate_outlier_detection_result

class daal4py.univariate_outlier_detection_result

Properties:

weights

Numpy array

Type:

type

Multivariate Bacon Outlier Detection

Parameters and semantics are described in oneAPI Data Analytics Library Multivariate Bacon Outlier Detection.

class daal4py.bacon_outlier_detection
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the BACON outlier detection, double or float

  • method (str) – [optional, default: “defaultDense”] BACON outlier detection computation method

  • initMethod (str) – [optional, default: “”] Initialization method

  • alpha (double) – [optional, default: get_nan64()] One-tailed probability that defines the (1 - alpha) quantile of the chi^2 distribution with p degrees of freedom. Recommended value: alpha / n, where n is the number of observations.

  • toleranceToConverge (double) – [optional, default: get_nan64()] Stopping criterion: the algorithm is terminated if the size of the basic subset is changed by less than the threshold

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

bacon_outlier_detection_result

class daal4py.bacon_outlier_detection_result

Properties:

weights

Numpy array

Type:

type

Optimization Solvers

Objective Functions

Mean Squared Error Algorithm (MSE)

Parameters and semantics are described in oneAPI Data Analytics Library MSE.

class daal4py.optimization_solver_mse
Parameters:
  • numberOfTerms (size_t) – The number of terms in the function

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Mean squared error objective function, double or float

  • method (str) – [optional, default: “defaultDense”] The Mean squared error objective function computation method

  • interceptFlag (bool) – [optional, default: False] Whether the intercept needs to be computed. Default is true

  • penaltyL1 (array) – [optional, default: None] L1 regularization coefficients. Default is 0 (not applied)

  • penaltyL2 (array) – [optional, default: None] L2 regularization coefficients. Default is 0 (not applied)

  • batchIndices (array) – [optional, default: None] Numeric table of size 1 x m, where m is the batch size, that represents a batch of indices used to compute the function results, e.g. the value of the sum of the functions. If no indices are provided, all terms are used in the computations.

  • featureId (size_t) – [optional, default: -1] The feature index to compute part of gradient/hessian/proximal projection

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, dependentVariables, argument, weights, gramMatrix)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Numeric table of size n x p with data

  • dependentVariables (data_or_file) – Numeric table of size n x 1 with dependent variables

  • argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

  • weights (data_or_file) – NumericTable of size 1 x n with sample weights. Applied for all methods

  • gramMatrix (data_or_file) – NumericTable of size p x p with the Gram matrix of the input data. Applied for all methods

Return type:

optimization_solver_objective_function_result

setup(data, dependentVariables, argument, weights, gramMatrix)

Setup (partial) input data for using algorithm object in other algorithms.

Parameters:
  • data (data_or_file) – Numeric table of size n x p with data

  • dependentVariables (data_or_file) – Numeric table of size n x 1 with dependent variables

  • argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

  • weights (data_or_file) – NumericTable of size 1 x n with sample weights. Applied for all methods

  • gramMatrix (data_or_file) – NumericTable of size p x p with the Gram matrix of the input data. Applied for all methods

Return type:

None

daal4py.optimization_solver_mse_result

alias of optimization_solver_objective_function_result

Logistic Loss

Parameters and semantics are described in oneAPI Data Analytics Library Logistic Loss.

class daal4py.optimization_solver_logistic_loss
Parameters:
  • numberOfTerms (size_t) – The number of terms in the function

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Logistic loss objective function, double or float

  • method (str) – [optional, default: “defaultDense”] The Logistic loss objective function computation method

  • interceptFlag (bool) – [optional, default: False] Whether the intercept needs to be computed. Default is true

  • penaltyL1 (float) – [optional, default: get_nan32()] L1 regularization coefficient. Default is 0 (not applied)

  • penaltyL2 (float) – [optional, default: get_nan32()] L2 regularization coefficient. Default is 0 (not applied)

  • batchIndices (array) – [optional, default: None] Numeric table of size 1 x m, where m is the batch size, that represents a batch of indices used to compute the function results, e.g. the value of the sum of the functions. If no indices are provided, all terms are used in the computations.

  • featureId (size_t) – [optional, default: -1] The feature index to compute part of gradient/hessian/proximal projection

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, dependentVariables, argument)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Numeric table of size n x p with data

  • dependentVariables (data_or_file) – Numeric table of size n x 1 with dependent variables

  • argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

Return type:

optimization_solver_objective_function_result

setup(data, dependentVariables, argument)

Setup (partial) input data for using algorithm object in other algorithms.

Parameters:
  • data (data_or_file) – Numeric table of size n x p with data

  • dependentVariables (data_or_file) – Numeric table of size n x 1 with dependent variables

  • argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

Return type:

None

daal4py.optimization_solver_logistic_loss_result

alias of optimization_solver_objective_function_result

Cross-entropy Loss

Parameters and semantics are described in oneAPI Data Analytics Library Cross Entropy Loss.

class daal4py.optimization_solver_cross_entropy_loss
Parameters:
  • nClasses (size_t) – Number of classes (different values of dependent variable)

  • numberOfTerms (size_t) – The number of terms in the function

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Cross-entropy loss objective function, double or float

  • method (str) – [optional, default: “defaultDense”] The Cross-entropy loss objective function computation method

  • interceptFlag (bool) – [optional, default: False] Whether the intercept needs to be computed. Default is true

  • penaltyL1 (float) – [optional, default: get_nan32()] L1 regularization coefficient. Default is 0 (not applied)

  • penaltyL2 (float) – [optional, default: get_nan32()] L2 regularization coefficient. Default is 0 (not applied)

  • batchIndices (array) – [optional, default: None] Numeric table of size 1 x m, where m is the batch size, that represents a batch of indices used to compute the function results, e.g. the value of the sum of the functions. If no indices are provided, all terms are used in the computations.

  • featureId (size_t) – [optional, default: -1] The feature index to compute part of gradient/hessian/proximal projection

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(data, dependentVariables, argument)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Numeric table of size n x p with data

  • dependentVariables (data_or_file) – Numeric table of size n x 1 with dependent variables

  • argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

Return type:

optimization_solver_objective_function_result

setup(data, dependentVariables, argument)

Setup (partial) input data for using algorithm object in other algorithms.

Parameters:
  • data (data_or_file) – Numeric table of size n x p with data

  • dependentVariables (data_or_file) – Numeric table of size n x 1 with dependent variables

  • argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

Return type:

None

daal4py.optimization_solver_cross_entropy_loss_result

alias of optimization_solver_objective_function_result

Sum of Functions

daal4py.optimization_solver_sum_of_functions_result

alias of optimization_solver_objective_function_result

Iterative Solvers

Stochastic Gradient Descent Algorithm

Parameters and semantics are described in oneAPI Data Analytics Library SGD.

Examples:
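
A minimal sketch minimizing the MSE objective function from the section above; the learning-rate table, iteration count and starting point follow the upstream daal4py example and are illustrative:

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    y = X @ np.array([[1.0], [2.0], [3.0]]) + 4.0  # linear data, intercept 4

    mse = d4p.optimization_solver_mse(numberOfTerms=X.shape[0])
    mse.setup(X, y)  # bind the data; the solver supplies the argument

    sgd = d4p.optimization_solver_sgd(
        mse,
        learningRateSequence=np.array([[1.0]]),
        nIterations=1000,
        accuracyThreshold=1e-6)
    result = sgd.compute(np.zeros((X.shape[1] + 1, 1)))  # p+1: intercept + coefficients
    # result.minimum approximates [intercept, coefficients];
    # result.nIterations reports the steps taken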

class daal4py.optimization_solver_sgd
Parameters:
  • function (optimization_solver_sum_of_functions_batch__iface__) – Objective function represented as sum of functions

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Stochastic gradient descent algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Stochastic gradient descent computation method

  • batchIndices (array) – [optional, default: None] Numeric table that represents 32 bit integer indices of terms in the objective function. If no indices are provided, the implementation will generate random indices.

  • learningRateSequence (array) – [optional, default: None] Numeric table that contains values of the learning rate sequence

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for random generation of 32 bit integer indices of terms in the objective function.

  • nIterations (size_t) – [optional, default: -1] Maximal number of iterations of the algorithm

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the algorithm. The algorithm terminates when this accuracy is achieved

  • optionalResultRequired (bool) – [optional, default: False] Indicates whether optional result is required

  • batchSize (size_t) – [optional, default: -1] Number of batch indices used to compute the stochastic gradient. If batchSize equals the number of terms in the objective function, no random sampling is performed and all terms are used to calculate the gradient. This parameter is ignored if batchIndices is provided.

  • conservativeSequence (array) – [optional, default: None] Numeric table of values of the conservative coefficient sequence

  • innerNIterations (size_t) – [optional, default: -1]

  • momentum (double) – [optional, default: get_nan64()] Momentum value

compute(inputArgument)

Do the actual computation on provided input data.

Parameters:

inputArgument (data_or_file) – Initial value to start optimization

Return type:

optimization_solver_sgd_result

class daal4py.optimization_solver_sgd_result

Properties:

minimum

Numpy array

Type:

type

nIterations

Numpy array

Type:

type

Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm

Parameters and semantics are described in oneAPI Data Analytics Library LBFGS.

class daal4py.optimization_solver_lbfgs
Parameters:
  • function (optimization_solver_sum_of_functions_batch__iface__) – Objective function represented as sum of functions

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the LBFGS algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] LBFGS computation method

  • m (size_t) – [optional, default: -1] Memory parameter of LBFGS. The maximum number of correction pairs that define the approximation of inverse Hessian matrix.

  • L (size_t) – [optional, default: -1] The number of iterations between the curvature estimates calculations

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for random choosing terms from objective function.

  • batchIndices (array) – [optional, default: None]

  • correctionPairBatchSize (size_t) – [optional, default: -1] Number of observations to compute the sub-sampled Hessian for correction pairs computation

  • correctionPairBatchIndices (array) – [optional, default: None]

  • stepLengthSequence (array) – [optional, default: None]

  • nIterations (size_t) – [optional, default: -1] Maximal number of iterations of the algorithm

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the algorithm. The algorithm terminates when this accuracy is achieved

  • optionalResultRequired (bool) – [optional, default: False] Indicates whether optional result is required

  • batchSize (size_t) – [optional, default: -1] Number of batch indices to compute the stochastic gradient. If batchSize is equal to the number of terms in objective function then no random sampling is performed, and all terms are used to calculate the gradient. This parameter is ignored if batchIndices is provided.

compute(inputArgument)

Do the actual computation on provided input data.

Parameters:

inputArgument (data_or_file) – Initial value to start optimization

Return type:

optimization_solver_lbfgs_result

class daal4py.optimization_solver_lbfgs_result

Properties:
  • minimum (Numpy array)

  • nIterations (Numpy array)

Adaptive Subgradient Method

Parameters and semantics are described in oneAPI Data Analytics Library AdaGrad.

Examples:
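
A minimal sketch (hypothetical data; learningRate is a 1x1 numeric table):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    y = np.random.rand(100, 1)

    mse = d4p.optimization_solver_mse(numberOfTerms=X.shape[0])
    mse.setup(X, y)

    solver = d4p.optimization_solver_adagrad(mse, learningRate=np.array([[1.0]]),
                                             nIterations=1000, batchSize=1,
                                             accuracyThreshold=1e-5)
    res = solver.compute(np.zeros((X.shape[1] + 1, 1)))
    print(res.minimum)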

class daal4py.optimization_solver_adagrad
Parameters:
  • function (optimization_solver_sum_of_functions_batch__iface__) – Objective function represented as sum of functions

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Adaptive gradient descent algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Adaptive gradient descent computation method

  • batchIndices (array) – [optional, default: None] Numeric table that represents 32 bit integer indices of terms in the objective function. If no indices are provided, the implementation will generate random indices.

  • learningRate (array) – [optional, default: None] Numeric table that contains value of the learning rate

  • degenerateCasesThreshold (double) – [optional, default: get_nan64()] Value needed to avoid degenerate cases in square root computing.

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for random generation of 32 bit integer indices of terms in the objective function.

  • nIterations (size_t) – [optional, default: -1] Maximal number of iterations of the algorithm

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the algorithm. The algorithm terminates when this accuracy is achieved

  • optionalResultRequired (bool) – [optional, default: False] Indicates whether optional result is required

  • batchSize (size_t) – [optional, default: -1] Number of batch indices to compute the stochastic gradient. If batchSize is equal to the number of terms in objective function then no random sampling is performed, and all terms are used to calculate the gradient. This parameter is ignored if batchIndices is provided.

compute(inputArgument)

Do the actual computation on provided input data.

Parameters:

inputArgument (data_or_file) – Initial value to start optimization

Return type:

optimization_solver_adagrad_result

class daal4py.optimization_solver_adagrad_result

Properties:
  • minimum (Numpy array)

  • nIterations (Numpy array)

Stochastic Average Gradient Descent

Parameters and semantics are described in oneAPI Data Analytics Library Stochastic Average Gradient Descent SAGA.

Examples:
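
A minimal sketch using an L1-penalized logistic-loss objective (hypothetical data and penalty value; gradientsTable is an optional warm-start input and is passed here as None):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    y = np.random.randint(0, 2, size=(100, 1)).astype(np.float64)

    loss = d4p.optimization_solver_logistic_loss(numberOfTerms=X.shape[0],
                                                 penaltyL1=0.01)
    loss.setup(X, y)

    solver = d4p.optimization_solver_saga(function=loss, nIterations=1000,
                                          accuracyThreshold=1e-5)
    res = solver.compute(np.zeros((X.shape[1] + 1, 1)), None)
    print(res.minimum)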

class daal4py.optimization_solver_saga
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Stochastic average gradient descent algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Stochastic average gradient descent computation method

  • batchIndices (array) – [optional, default: None] Numeric table that represents 32 bit integer indices of terms in the objective function. If no indices are provided, the implementation will generate random indices.

  • learningRateSequence (array) – [optional, default: None] Numeric table that contains value of the learning rate

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for random generation of 32 bit integer indices of terms in the objective function.

  • function (optimization_solver_sum_of_functions_batch__iface__) – [optional, default: None] Objective function represented as sum of functions

  • nIterations (size_t) – [optional, default: -1] Maximal number of iterations of the algorithm

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the algorithm. The algorithm terminates when this accuracy is achieved

  • optionalResultRequired (bool) – [optional, default: False] Indicates whether optional result is required

  • batchSize (size_t) – [optional, default: -1] Number of batch indices to compute the stochastic gradient. If batchSize is equal to the number of terms in objective function then no random sampling is performed, and all terms are used to calculate the gradient. This parameter is ignored if batchIndices is provided.

compute(inputArgument, gradientsTable)

Do the actual computation on provided input data.

Parameters:
  • inputArgument (data_or_file) – Initial value to start optimization

  • gradientsTable (data_or_file) – Numeric table of size p x 1 with the values of G, where each value is an accumulated sum of squares of the corresponding gradient coordinate values.

Return type:

optimization_solver_saga_result

class daal4py.optimization_solver_saga_result

Properties:
  • gradientsTable (Numpy array)

  • minimum (Numpy array)

  • nIterations (Numpy array)

Coordinate Descent

Parameters and semantics are described in oneAPI Data Analytics Library Coordinate Descent Algorithm.
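
For illustration, a minimal sketch with an MSE objective (hypothetical data; assumes the chosen objective function implements the proximal operator required by coordinate descent):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    y = np.random.rand(100, 1)

    mse = d4p.optimization_solver_mse(numberOfTerms=X.shape[0])
    mse.setup(X, y)

    solver = d4p.optimization_solver_coordinate_descent(function=mse,
                                                        nIterations=100,
                                                        accuracyThreshold=1e-5)
    res = solver.compute(np.zeros((X.shape[1] + 1, 1)))
    print(res.minimum)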

class daal4py.optimization_solver_coordinate_descent
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Coordinate descent algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Coordinate descent computation method

  • seed (size_t) – [optional, default: -1] Seed for random generation of 32 bit integer indices of terms in the objective function (deprecated; use engine instead).

  • engine (engines_batchbase__iface__) – [optional, default: None] Engine for random generation of 32 bit integer indices of terms in the objective function.

  • selection (str) – [optional, default: “”]

  • positive (bool) – [optional, default: False]

  • skipTheFirstComponents (bool) – [optional, default: False]

  • function (optimization_solver_sum_of_functions_batch__iface__) – [optional, default: None] Objective function represented as sum of functions

  • nIterations (size_t) – [optional, default: -1] Maximal number of iterations of the algorithm

  • accuracyThreshold (double) – [optional, default: get_nan64()] Accuracy of the algorithm. The algorithm terminates when this accuracy is achieved

  • optionalResultRequired (bool) – [optional, default: False] Indicates whether optional result is required

  • batchSize (size_t) – [optional, default: -1] Number of batch indices to compute the stochastic gradient. If batchSize is equal to the number of terms in objective function then no random sampling is performed, and all terms are used to calculate the gradient. This parameter is ignored if batchIndices is provided.

compute(inputArgument)

Do the actual computation on provided input data.

Parameters:

inputArgument (data_or_file) – Initial value to start optimization

Return type:

optimization_solver_coordinate_descent_result

class daal4py.optimization_solver_coordinate_descent_result

Properties:
  • minimum (Numpy array)

  • nIterations (Numpy array)

Precomputed Function

Parameters and semantics are described in oneAPI Data Analytics Library Objective Function with Precomputed Characteristics.

class daal4py.optimization_solver_precomputed
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the objective function with precomputed characteristics, double or float

  • method (str) – [optional, default: “defaultDense”] The objective function with precomputed characteristics method

  • numberOfTerms (size_t) – [optional, default: -1] The number of terms in the function

  • batchIndices (array) – [optional, default: None] Numeric table of size 1 x m, where m is the batch size, that represents a batch of indices used to compute the function results, e.g., the value of the sum of the functions. If no indices are provided, all terms will be used in the computations.

  • featureId (size_t) – [optional, default: -1] The feature index to compute part of gradient/hessian/proximal projection

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

compute(argument)

Do the actual computation on provided input data.

Parameters:

argument (data_or_file) – Numeric table of size 1 x p with input argument of the objective function

Return type:

optimization_solver_objective_function_result

daal4py.optimization_solver_precomputed_result

alias of optimization_solver_objective_function_result

Recommender systems

Association Rules

Parameters and semantics are described in oneAPI Data Analytics Library Association Rules.

Examples:
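
A minimal sketch (hypothetical transactions; the input table holds one (transaction id, item id) pair per row):

    import numpy as np
    import daal4py as d4p

    # Three transactions: {1, 2}, {1, 2, 3}, {2}
    data = np.array([[0, 1], [0, 2],
                     [1, 1], [1, 2], [1, 3],
                     [2, 2]], dtype=np.float64)

    algo = d4p.association_rules(minSupport=0.3, minConfidence=0.6,
                                 discoverRules=True)
    res = algo.compute(data)
    # Frequent itemsets and their supports
    print(res.largeItemsets, res.largeItemsetsSupport)
    # Rules: antecedent -> consequent, with confidence
    print(res.antecedentItemsets, res.consequentItemsets, res.confidence)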

class daal4py.association_rules
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the association rules algorithm, double or float

  • method (str) – [optional, default: “apriori”] Association rules algorithm computation method

  • minSupport (double) – [optional, default: get_nan64()] Minimum support 0.0 <= minSupport < 1.0

  • minConfidence (double) – [optional, default: get_nan64()] Minimum confidence 0.0 <= minConfidence < 1.0

  • nUniqueItems (size_t) – [optional, default: -1] Number of unique items

  • nTransactions (size_t) – [optional, default: -1] Number of transactions

  • discoverRules (bool) – [optional, default: False] Flag. If true, association rules are built from large itemsets

  • itemsetsOrder (str) – [optional, default: “”] Format of the resulting itemsets

  • rulesOrder (str) – [optional, default: “”] Format of the resulting association rules

  • minItemsetSize (size_t) – [optional, default: -1] Minimum number of items in a large itemset

  • maxItemsetSize (size_t) – [optional, default: -1] Maximum number of items in a large itemset. Set to zero to not limit the upper boundary for the size of large itemsets

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

association_rules_result

class daal4py.association_rules_result

Properties:
  • antecedentItemsets (Numpy array)

  • confidence (Numpy array)

  • consequentItemsets (Numpy array)

  • largeItemsets (Numpy array)

  • largeItemsetsSupport (Numpy array)

Implicit Alternating Least Squares (implicit ALS)

Parameters and semantics are described in oneAPI Data Analytics Library Implicit Alternating Least Squares.

Examples:
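
A minimal sketch of the train-then-predict flow (hypothetical ratings; uses the companion initializer implicit_als_training_init to produce the initial factors, as in the daal4py example scripts):

    import numpy as np
    import daal4py as d4p

    # Hypothetical users x items ratings; zeros mean "no interaction"
    ratings = np.array([[1.0, 0.0, 2.0],
                        [0.0, 3.0, 0.0],
                        [4.0, 0.0, 0.0],
                        [0.0, 1.0, 2.0]])

    nFactors = 2
    init = d4p.implicit_als_training_init(nFactors=nFactors).compute(ratings)
    trained = d4p.implicit_als_training(nFactors=nFactors,
                                        maxIterations=5).compute(ratings, init.model)
    pred = d4p.implicit_als_prediction_ratings(nFactors=nFactors).compute(trained.model)
    print(pred.prediction)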

class daal4py.implicit_als_training
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for implicit ALS model training, double or float

  • method (str) – [optional, default: “defaultDense”] Implicit ALS training method

  • nFactors (size_t) – [optional, default: -1] Number of factors

  • maxIterations (size_t) – [optional, default: -1] Maximum number of iterations of the implicit ALS training algorithm

  • alpha (double) – [optional, default: get_nan64()] Confidence parameter of the implicit ALS training algorithm

  • lambda (double) – [optional, default: get_nan64()] Regularization parameter

  • preferenceThreshold (double) – [optional, default: get_nan64()] Threshold used to define preference values

compute(data, inputModel)

Do the actual computation on provided input data.

Parameters:
  • data (data_or_file) – Input data table that contains ratings

  • inputModel (implicit_als_modelptr) – Initial model that contains initialized factors

Return type:

implicit_als_training_result

class daal4py.implicit_als_training_result

Properties:
  • model (implicit_als_model)

class daal4py.implicit_als_model

Properties:
  • ItemsFactors (Numpy array)

  • UsersFactors (Numpy array)

class daal4py.implicit_als_prediction_ratings
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for implicit ALS model-based prediction, double or float

  • method (str) – [optional, default: “defaultDense”] Implicit ALS prediction method

  • nFactors (size_t) – [optional, default: -1] Number of factors

  • maxIterations (size_t) – [optional, default: -1] Maximum number of iterations of the implicit ALS training algorithm

  • alpha (double) – [optional, default: get_nan64()] Confidence parameter of the implicit ALS training algorithm

  • lambda (double) – [optional, default: get_nan64()] Regularization parameter

  • preferenceThreshold (double) – [optional, default: get_nan64()] Threshold used to define preference values

compute(model)

Do the actual computation on provided input data.

Parameters:

model (implicit_als_modelptr) – Input model trained by the ALS algorithm

Return type:

implicit_als_prediction_ratings_result

class daal4py.implicit_als_prediction_ratings_result

Properties:
  • prediction (Numpy array)

Covariance, correlation, and distances

Cosine Distance Matrix

Parameters and semantics are described in oneAPI Data Analytics Library Cosine Distance.

Examples:
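
A minimal sketch (hypothetical data; the result holds the pairwise cosine distances between the rows of the input):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(5, 3)
    res = d4p.cosine_distance().compute(X)
    print(res.cosineDistance)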

class daal4py.cosine_distance
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the cosine distance, double or float

  • method (str) – [optional, default: “defaultDense”] Cosine distance computation method

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

cosine_distance_result

class daal4py.cosine_distance_result

Properties:
  • cosineDistance (Numpy array)

Correlation Distance Matrix

Parameters and semantics are described in oneAPI Data Analytics Library Correlation Distance.

Examples:
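
Usage mirrors cosine_distance (minimal sketch, hypothetical data):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(5, 3)
    res = d4p.correlation_distance().compute(X)
    print(res.correlationDistance)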

class daal4py.correlation_distance
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the correlation distance algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Correlation distance computation method

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

correlation_distance_result

class daal4py.correlation_distance_result

Properties:
  • correlationDistance (Numpy array)

Correlation and Variance-Covariance Matrices

Parameters and semantics are described in oneAPI Data Analytics Library Correlation and Variance-Covariance Matrices.

Examples:
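
A minimal sketch of batch and streaming use (hypothetical data; in streaming mode, partial results accumulate over compute() calls and are combined with finalize()):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(1000, 4)

    # Batch mode
    res = d4p.covariance().compute(X)
    print(res.covariance, res.mean)

    # Streaming mode: feed chunks, then finalize
    algo = d4p.covariance(streaming=True)
    for chunk in np.array_split(X, 4):
        algo.compute(chunk)
    res_stream = algo.finalize()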

class daal4py.covariance
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of the correlation or variance-covariance matrix, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method

  • outputMatrixType (str) – [optional, default: “”] Type of the computed matrix

  • distributed (bool) – [optional, default: False] Enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] Enable streaming

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

covariance_result

class daal4py.covariance_result

Properties:
  • correlation (Numpy array)

  • covariance (Numpy array)

  • mean (Numpy array)

Data pre-processing

Normalization

Parameters and semantics are described in oneAPI Data Analytics Library Normalization.

Z-Score

Parameters and semantics are described in oneAPI Data Analytics Library Z-Score.

Examples:
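
A minimal sketch (hypothetical data; doScale=True requests centering and scaling):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(10, 3)
    res = d4p.normalization_zscore(doScale=True).compute(X)
    print(res.normalizedData)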

class daal4py.normalization_zscore
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the z-score normalization, double or float

  • method (str) – [optional, default: “defaultDense”] Z-score normalization computation method

  • resultsToCompute (str) – [optional, default: “”] Type of results to compute or to evaluate. Can pass one of "computeClassLabels", "computeClassProbabilities", "computeClassLogProbabilities"; or more than one by joining them with separator bars (e.g. "computeClassLabels|computeClassProbabilities"). Note that not all of these are supported on every class/method accepting this argument (see docs for oneDAL for details on what this specific class/method supports).

  • doScale (bool) – [optional, default: False] Boolean flag that indicates the mode of computation: if True, both centering and scaling are applied; otherwise, only centering.

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

normalization_zscore_result

class daal4py.normalization_zscore_result

Properties:
  • means (Numpy array)

  • normalizedData (Numpy array)

  • variances (Numpy array)

Min-Max

Parameters and semantics are described in oneAPI Data Analytics Library Min-Max.

Examples:
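
A minimal sketch (hypothetical data and bounds):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(10, 3)
    res = d4p.normalization_minmax(lowerBound=-1.0, upperBound=1.0).compute(X)
    print(res.normalizedData)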

class daal4py.normalization_minmax
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the min-max normalization, double or float

  • method (str) – [optional, default: “defaultDense”] Min-max normalization computation method

  • lowerBound (double) – [optional, default: get_nan64()] The lower bound of the feature values after normalization.

  • upperBound (double) – [optional, default: get_nan64()] The upper bound of the feature values after normalization.

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

normalization_minmax_result

class daal4py.normalization_minmax_result

Properties:
  • normalizedData (Numpy array)

Statistics

Moments of Low Order

Parameters and semantics are described in oneAPI Data Analytics Library Moments of Low Order.

Examples:
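
A minimal sketch (hypothetical data; each result property is a 1 x p table with one value per feature):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    res = d4p.low_order_moments().compute(X)
    print(res.mean, res.variance, res.minimum, res.maximum)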

class daal4py.low_order_moments
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of the low order moments, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the algorithm

  • estimatesToCompute (str) – [optional, default: “”] Estimates to be computed by the algorithm

  • distributed (bool) – [optional, default: False] Enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] Enable streaming

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

low_order_moments_result

class daal4py.low_order_moments_result

Properties:
  • maximum (Numpy array)

  • mean (Numpy array)

  • minimum (Numpy array)

  • secondOrderRawMoment (Numpy array)

  • standardDeviation (Numpy array)

  • sum (Numpy array)

  • sumSquares (Numpy array)

  • sumSquaresCentered (Numpy array)

  • variance (Numpy array)

  • variation (Numpy array)

Quantiles

Parameters and semantics are described in oneAPI Data Analytics Library Quantiles.

Examples:
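
A minimal sketch (hypothetical data; quantileOrders is passed as a 1 x m numeric table):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(100, 3)
    orders = np.array([[0.25, 0.5, 0.75]])
    res = d4p.quantiles(quantileOrders=orders).compute(X)
    print(res.quantiles)  # requested quantiles for each feature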

class daal4py.quantiles
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the quantile algorithms, double or float

  • method (str) – [optional, default: “defaultDense”] Quantiles computation method

  • quantileOrders (array) – [optional, default: None] Numeric table with quantile orders. Default value is 0.5 (median)

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

quantiles_result

class daal4py.quantiles_result

Properties:
  • quantiles (Numpy array)

Linear algebra

Cholesky Decomposition

Parameters and semantics are described in oneAPI Data Analytics Library Cholesky Decomposition.

Examples:
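
A minimal sketch (hypothetical symmetric positive-definite input):

    import numpy as np
    import daal4py as d4p

    A = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
    L = d4p.cholesky().compute(A).choleskyFactor
    # L is the lower-triangular factor: A = L @ L.T
    assert np.allclose(L @ L.T, A)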

class daal4py.cholesky
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the Cholesky decomposition algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Cholesky decomposition computation method

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

cholesky_result

class daal4py.cholesky_result

Properties:
  • choleskyFactor (Numpy array)

QR Decomposition

Parameters and semantics are described in oneAPI Data Analytics Library QR Decomposition.

QR Decomposition (without pivoting)

Parameters and semantics are described in oneAPI Data Analytics Library QR Decomposition without pivoting.

Examples:
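
A minimal sketch (hypothetical data):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(10, 3)
    res = d4p.qr().compute(X)
    # Q is n x p with orthonormal columns, R is p x p upper triangular
    assert np.allclose(res.matrixQ @ res.matrixR, X)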

class daal4py.qr
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the QR decomposition algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the algorithm

  • distributed (bool) – [optional, default: False] Enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] Enable streaming

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

qr_result

class daal4py.qr_result

Properties:
  • matrixQ (Numpy array)

  • matrixR (Numpy array)

Pivoted QR Decomposition

Parameters and semantics are described in oneAPI Data Analytics Library Pivoted QR Decomposition.

Examples:
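
A minimal sketch (hypothetical data; the column permutation is returned as a numeric table rather than applied for you):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(10, 3)
    res = d4p.pivoted_qr().compute(X)
    print(res.matrixQ, res.matrixR, res.permutationMatrix)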

class daal4py.pivoted_qr
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of the pivoted QR algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method

  • permutedColumns (array) – [optional, default: None] On entry, if the i-th element of permutedColumns != 0, the i-th column of the input matrix is moved to the beginning of Data*P before the computation and is fixed in place during the computation. If the i-th element of permutedColumns = 0, the i-th column of the input data is a free column (that is, it may be interchanged during the computation with any other free column).

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

pivoted_qr_result

class daal4py.pivoted_qr_result

Properties:
  • matrixQ (Numpy array)

  • matrixR (Numpy array)

  • permutationMatrix (Numpy array)

Singular Value Decomposition (SVD)

Parameters and semantics are described in oneAPI Data Analytics Library SVD.

Examples:
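
A minimal sketch (hypothetical data; singularValues is a 1 x p table, and rightSingularMatrix is returned transposed, as in the daal4py svd example script):

    import numpy as np
    import daal4py as d4p

    X = np.random.rand(10, 3)
    res = d4p.svd().compute(X)
    U, s, Vt = res.leftSingularMatrix, res.singularValues[0], res.rightSingularMatrix
    assert np.allclose(U @ np.diag(s) @ Vt, X)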

class daal4py.svd
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the SVD algorithm, double or float

  • method (str) – [optional, default: “defaultDense”] SVD computation method

  • leftSingularMatrix (str) – [optional, default: “”] Format of the matrix of left singular vectors

  • rightSingularMatrix (str) – [optional, default: “”] Format of the matrix of right singular vectors

  • distributed (bool) – [optional, default: False] Enable distributed computation (SPMD)

  • streaming (bool) – [optional, default: False] Enable streaming

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

svd_result

class daal4py.svd_result

Properties:
  • leftSingularMatrix (Numpy array)

  • rightSingularMatrix (Numpy array)

  • singularValues (Numpy array)

Random number generation

Random Number Engines

Parameters and semantics are described in oneAPI Data Analytics Library Engines.

class daal4py.engines_result

Properties:
  • randomNumbers (Numpy array)

mt19937

Parameters and semantics are described in oneAPI Data Analytics Library mt19937.

class daal4py.engines_mt19937
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of mt19937 engine, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the engine

  • seed (size_t) – [optional, default: -1] seed

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

engines_result

daal4py.engines_mt19937_result

alias of engines_result

mt2203

Parameters and semantics are described in oneAPI Data Analytics Library mt2203.

class daal4py.engines_mt2203
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of mt2203 engine, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the engine

  • seed (size_t) – [optional, default: -1] seed

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

engines_result

daal4py.engines_mt2203_result

alias of engines_result

mcg59

Parameters and semantics are described in oneAPI Data Analytics Library mcg59.

class daal4py.engines_mcg59
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of mcg59 engine, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the engine

  • seed (size_t) – [optional, default: -1] seed

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

engines_result

daal4py.engines_mcg59_result

alias of engines_result

mrg32k3a

class daal4py.engines_mrg32k3a
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of mrg32k3a engine, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the engine

  • seed (size_t) – [optional, default: -1] seed

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

engines_result

daal4py.engines_mrg32k3a_result

alias of engines_result

philox4x32x10

class daal4py.engines_philox4x32x10
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of philox4x32x10 engine, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the engine

  • seed (size_t) – [optional, default: -1] seed

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

engines_result

daal4py.engines_philox4x32x10_result

alias of engines_result

Distributions

Parameters and semantics are described in oneAPI Data Analytics Library Distributions.

Bernoulli

Parameters and semantics are described in oneAPI Data Analytics Library Bernoulli Distribution.

Examples:
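
A minimal sketch (hypothetical table shape and seed; compute() fills the provided table with draws and exposes them as res.randomNumbers):

    import numpy as np
    import daal4py as d4p

    t = np.zeros((1, 10))
    engine = d4p.engines_mt19937(seed=777)
    res = d4p.distributions_bernoulli(p=0.5, engine=engine).compute(t)
    print(res.randomNumbers)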

class daal4py.distributions_bernoulli
Parameters:
  • p (double) – Success probability of a trial, value in [0.0, 1.0]

  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of bernoulli distribution, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the distribution

  • engine (engines_batchbase__iface__) – [optional, default: None] Pointer to the engine

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

distributions_result

daal4py.distributions_bernoulli_result

alias of distributions_result

Normal

Parameters and semantics are described in oneAPI Data Analytics Library Normal Distribution.

Examples:
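
A minimal sketch (hypothetical table shape and seed):

    import numpy as np
    import daal4py as d4p

    t = np.zeros((1, 10))
    engine = d4p.engines_mt19937(seed=777)
    res = d4p.distributions_normal(a=0.0, sigma=1.0, engine=engine).compute(t)
    print(res.randomNumbers)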

class daal4py.distributions_normal
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of normal distribution, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the distribution

  • a (double) – [optional, default: get_nan64()] Mean

  • sigma (double) – [optional, default: get_nan64()] Standard deviation

  • engine (engines_batchbase__iface__) – [optional, default: None] Pointer to the engine

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

distributions_result

daal4py.distributions_normal_result

alias of distributions_result

Uniform

Parameters and semantics are described in oneAPI Data Analytics Library Uniform Distribution.

Examples:
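
A minimal sketch (hypothetical table shape and seed):

    import numpy as np
    import daal4py as d4p

    t = np.zeros((1, 10))
    engine = d4p.engines_mt19937(seed=777)
    res = d4p.distributions_uniform(a=0.0, b=1.0, engine=engine).compute(t)
    print(res.randomNumbers)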

class daal4py.distributions_uniform
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations of uniform distribution, double or float

  • method (str) – [optional, default: “defaultDense”] Computation method of the distribution

  • a (double) – [optional, default: get_nan64()] Left bound a

  • b (double) – [optional, default: get_nan64()] Right bound b

  • engine (engines_batchbase__iface__) – [optional, default: None] Pointer to the engine

compute(tableToFill)

Do the actual computation on provided input data.

Parameters:

tableToFill (data_or_file) – Input table to fill with random numbers

Return type:

distributions_result

daal4py.distributions_uniform_result

alias of distributions_result

Sorting

Parameters and semantics are described in oneAPI Data Analytics Library Sorting.

Examples:
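
A minimal sketch (hypothetical data; oneDAL's sorting orders the observations within each feature independently):

    import numpy as np
    import daal4py as d4p

    X = np.array([[3.0], [1.0], [2.0]])
    res = d4p.sorting().compute(X)
    print(res.sortedData)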

class daal4py.sorting
Parameters:
  • fptype (str) – [optional, default: “double”] Data type to use in intermediate computations for the sorting, double or float

  • method (str) – [optional, default: “defaultDense”] Sorting computation method

compute(data)

Do the actual computation on provided input data.

Parameters:

data (data_or_file) – Input data table

Return type:

sorting_result

class daal4py.sorting_result

Properties:
  • sortedData (Numpy array)