.. ****************************************************************************** .. * Copyright 2020 Intel Corporation .. * .. * Licensed under the Apache License, Version 2.0 (the "License"); .. * you may not use this file except in compliance with the License. .. * You may obtain a copy of the License at .. * .. * http://www.apache.org/licenses/LICENSE-2.0 .. * .. * Unless required by applicable law or agreed to in writing, software .. * distributed under the License is distributed on an "AS IS" BASIS, .. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. .. * See the License for the specific language governing permissions and .. * limitations under the License. .. *******************************************************************************/ Batch Processing ================ Algorithm Parameters ******************** The DBSCAN clustering algorithm has the following parameters: .. tabularcolumns:: |\Y{0.15}|\Y{0.15}|\Y{0.7}| .. list-table:: Algorithm Parameters for DBSCAN (Batch Processing) :widths: 10 10 60 :header-rows: 1 :class: longtable * - Parameter - Default Valude - Description * - ``algorithmFPType`` - ``float`` - The floating-point type that the algorithm uses for intermediate computations. Can be ``float`` or ``double``. * - ``method`` - ``defaultDense`` - Available methods for computation of DBSCAN algorithm: - ``defaultDense`` – uses brute-force for neighborhood computation * - ``epsilon`` - Not applicable - The maximum distance between observations lying in the same neighborhood. * - ``minObservations`` - Not applicable - The number of observations in a neighborhood for an observation to be considered as a :term:`core ` one. * - ``memorySavingMode`` - ``false`` - If flag is set to false, all neighborhoods will be computed and stored prior to clustering. It will require up to :math:`O(|\text{sum of sizes of all observations' neighborhoods}|)` of additional memory, which in worst case can be :math:`O(|\text{number of observations}|^2)`. However, in general, performance may be better. .. note:: On GPU, the ``memorySavingMode`` flag can only be set to ``true``. You will get an error if the flag is set to ``false``. * - ``resultsToCompute`` - :math:`0` - The 64-bit integer flag that specifies which extra characteristics of the DBSCAN algorithm to compute. Provide one of the following values to request a single characteristic or use bitwise OR to request a combination of the characteristics: - ``computeCoreIndices`` for indices of core observations - ``computeCoreObservations`` for core observations Algorithm Input *************** The DBSCAN algorithm accepts the input described below. Pass the ``Input ID`` as a parameter to the methods that provide input for your algorithm. For more details, see :ref:`algorithms`. .. tabularcolumns:: |\Y{0.2}|\Y{0.8}| .. list-table:: Algorithm Input for DBSCAN (Batch Processing) :widths: 10 60 :header-rows: 1 :class: longtable * - Input ID - Input * - ``data`` - Pointer to the :math:`n \times p` numeric table with the data to be clustered. .. note:: The input can be an object of any class derived from ``NumericTable``. * - ``weights`` - Optional input. Pointer to the :math:`n \times 1` numeric table with weights of observations. .. note:: The input can be an object of any class derived from ``NumericTable`` except ``PackedTriangularMatrix``, ``PackedSymmetricMatrix``. By default all weights are equal to :math:`1`. .. note:: This parameter is ignored on GPU. Algorithm Output **************** The DBSCAN algorithms calculates the results described below. Pass the ``Result ID`` as a parameter to the methods that access the result of your algorithm. For more details, see :ref:`algorithms`. .. tabularcolumns:: |\Y{0.2}|\Y{0.8}| .. list-table:: Algorithm Output for DBSCAN (Batch Processing) :widths: 10 60 :header-rows: 1 :class: longtable * - Result ID - Result * - ``assignments`` - Pointer to the :math:`n \times 1` numeric table with assignments of cluster indices to observations in the input data. :term:`Noise observations ` have the assignment equal to :math:`-1`. * - ``nClusters`` - Pointer to the :math:`1 \times 1` numeric table with the total number of clusters found by the algorithm. * - ``coreIndices`` - Pointer to the numeric table with :math:`1` column and arbitrary number of rows, containing indices of core observations. * - ``coreObservations`` - Pointer to the numeric table with :math:`p` columns and arbitrary number of rows, containing core observations. .. note:: By default, this result is an object of the ``HomogenNumericTable`` class, but you can define the result as an object of any class derived from ``NumericTable`` except ``PackedTriangularMatrix``, ``PackedSymmetricMatrix``, and ``CSRNumericTable``.