.. ******************************************************************************
.. * Copyright 2020 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. *     http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. *******************************************************************************/

AdaBoost Classifier
===================

AdaBoost (short for "Adaptive Boosting") is a popular boosting classification algorithm.
The AdaBoost algorithm performs well on a variety of data sets, except for some noisy data [Freund99]_.

AdaBoost is a binary classifier. For a multi-class case, use the :ref:`svm_multi_class` framework of the library.

Details
*******

Given :math:`n` feature vectors :math:`x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})`
of size :math:`p` and a vector of class labels :math:`y = (y_1, \ldots, y_n)`,
where :math:`y_i \in K = \{-1, 1\}` describes the class to which the feature vector :math:`x_i` belongs,
and a weak learner algorithm, the problem is to build an AdaBoost classifier.

Training Stage
--------------

The following scheme shows the major steps of the algorithm:

#. Initialize weights :math:`D_1(i) = \frac{1}{n}` for :math:`i = 1, \ldots, n`.

#. For :math:`t = 1, \ldots, T`:

   #. Train the weak learner :math:`h_t(x) \in \{-1, 1\}` using weights :math:`D_t`.
   #. Choose a confidence value :math:`\alpha_t`.
   #. Update :math:`D_{t+1}(i) = \frac{D_t(i) \exp(-\alpha_t y_i h_t(x_i))}{Z_t}`,
      where :math:`Z_t` is a normalization factor chosen so that the weights :math:`D_{t+1}` sum to one.

#. Output the final hypothesis:

   .. math::
      H(x_i) = \mathrm{sign} \left( \sum_{t=1}^{T} \alpha_t h_t(x_i) \right)

Prediction Stage
----------------

Given the AdaBoost classifier and :math:`r` feature vectors :math:`x_1, \ldots, x_r`,
the problem is to calculate the final class:

.. math::
   H(x_i) = \mathrm{sign} \left( \sum_{t=1}^{T} \alpha_t h_t(x_i) \right)

Batch Processing
****************

The AdaBoost classifier follows the general workflow described in :ref:`classification_usage_model`.

Training
--------

For a description of the input and output, refer to :ref:`classification_usage_model`.

At the training stage, an AdaBoost classifier has the following parameters:

.. tabularcolumns::  |\Y{0.2}|\Y{0.2}|\Y{0.6}|

.. list-table:: Training Parameters for AdaBoost Classifier (Batch Processing)
   :header-rows: 1
   :widths: 10 20 30
   :align: left
   :class: longtable

   * - Parameter
     - Default Value
     - Description
   * - ``algorithmFPType``
     - ``float``
     - The floating-point type that the algorithm uses for intermediate computations. Can be ``float`` or ``double``.
   * - ``method``
     - ``defaultDense``
     - The computation method used by the AdaBoost classifier. The only training method supported so far is Y. Freund's method.
   * - ``weakLearnerTraining``
     - Pointer to an object of the stump training class
     - Pointer to the training algorithm of the weak learner. By default, a stump weak learner is used.
   * - ``weakLearnerPrediction``
     - Pointer to an object of the stump prediction class
     - Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used.
   * - ``accuracyThreshold``
     - :math:`0.01`
     - AdaBoost training accuracy.
   * - ``maxIterations``
     - :math:`100`
     - The maximum number of iterations for the algorithm.

Prediction
----------

For a description of the input and output, refer to :ref:`classification_usage_model`.

At the prediction stage, an AdaBoost classifier has the following parameters:

.. tabularcolumns::  |\Y{0.2}|\Y{0.2}|\Y{0.6}|

.. list-table:: Prediction Parameters for AdaBoost Classifier (Batch Processing)
   :header-rows: 1
   :widths: 10 20 30
   :align: left
   :class: longtable

   * - Parameter
     - Default Value
     - Description
   * - ``algorithmFPType``
     - ``float``
     - The floating-point type that the algorithm uses for intermediate computations. Can be ``float`` or ``double``.
   * - ``method``
     - ``defaultDense``
     - Performance-oriented computation method, the only method supported by the AdaBoost classifier at the prediction stage.
   * - ``weakLearnerPrediction``
     - Pointer to an object of the stump prediction class
     - Pointer to the prediction algorithm of the weak learner. By default, a stump weak learner is used.

Examples
********

.. tabs::

   .. tab:: C++ (CPU)

      Batch Processing:

      - :cpp_example:`adaboost_dense_batch.cpp`

   .. tab:: Python*

      - :daal4py_example:`adaboost.py`
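
To make the training and prediction stages in the Details section concrete, the following is a minimal pure-Python sketch of the scheme.
It is an illustration under stated assumptions, not the library's implementation: the weak learner is an exhaustively searched decision stump,
and the confidence value is set to :math:`\alpha_t = \frac{1}{2} \ln \frac{1 - \varepsilon_t}{\varepsilon_t}`,
where :math:`\varepsilon_t` is the weighted error of :math:`h_t`. This is a common choice that this page does not prescribe.
All function and parameter names (``train_stump``, ``adaboost_train``, ``adaboost_predict``, ``num_rounds``) are hypothetical and are not part of the library API.

.. code-block:: python

   import math


   def stump_predict(stump, x):
       """Evaluate a decision stump h(x) in {-1, 1} on one feature vector."""
       feature, threshold, polarity = stump
       return polarity if x[feature] > threshold else -polarity


   def train_stump(X, y, weights):
       """Return the stump with the lowest weighted error and that error."""
       best, best_error = None, float("inf")
       for feature in range(len(X[0])):
           values = sorted({x[feature] for x in X})
           # Candidate thresholds: one below the minimum, then midpoints.
           thresholds = [values[0] - 1.0]
           thresholds += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
           for threshold in thresholds:
               for polarity in (1, -1):
                   stump = (feature, threshold, polarity)
                   error = sum(w for x, label, w in zip(X, y, weights)
                               if stump_predict(stump, x) != label)
                   if error < best_error:
                       best, best_error = stump, error
       return best, best_error


   def adaboost_train(X, y, num_rounds=10):
       """Run up to num_rounds boosting rounds (T in the scheme above)."""
       n = len(X)
       weights = [1.0 / n] * n  # D_1(i) = 1/n
       model = []               # list of (alpha_t, h_t) pairs
       for _ in range(num_rounds):
           stump, error = train_stump(X, y, weights)
           if error >= 0.5:
               break  # weak learner is no better than chance
           error = max(error, 1e-10)  # avoid division by zero
           # Assumed choice of the confidence value alpha_t.
           alpha = 0.5 * math.log((1.0 - error) / error)
           model.append((alpha, stump))
           # D_{t+1}(i) = D_t(i) * exp(-alpha_t * y_i * h_t(x_i)) / Z_t
           weights = [w * math.exp(-alpha * label * stump_predict(stump, x))
                      for x, label, w in zip(X, y, weights)]
           z = sum(weights)  # normalization factor Z_t
           weights = [w / z for w in weights]
       return model


   def adaboost_predict(model, x):
       """H(x) = sign(sum_t alpha_t * h_t(x)); a zero score maps to -1."""
       score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
       return 1 if score > 0 else -1


   # Toy one-dimensional data that no single stump can separate.
   X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
   y = [1, 1, -1, -1, 1, 1]
   model = adaboost_train(X, y, num_rounds=3)
   print([adaboost_predict(model, x) for x in X])

On this toy data set, three rounds are enough for the weighted vote of three stumps to reproduce the training labels, even though each individual stump misclassifies at least two points.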