Reduction#

General#

The reduction primitive performs a reduction operation on arbitrary data. Each element in the destination is the result of applying the specified reduction algorithm along one or more source tensor dimensions:

\[\dst(f) = \mathop{reduce\_op}\limits_{r}\src(r),\]

where \(reduce\_op\) can be max, min, sum, mul, mean, Lp-norm and Lp-norm-power-p, \(f\) is an index in an idle dimension and \(r\) is an index in a reduction dimension.

Mean:

\[\dst(f) = \frac{\sum\limits_{r}\src(r)} {R},\]

where \(R\) is the size of a reduction dimension.
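The definitions above can be illustrated with a minimal reference sketch in plain C++. It is not the oneDNN API: `ReduceOp` and `reduce_rows` are hypothetical names, and the sketch assumes a row-major two-dimensional tensor reduced along its second dimension, so the row index plays the role of \(f\) and the column index the role of \(r\).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not part of the oneDNN API.
enum class ReduceOp { Max, Min, Sum, Mul, Mean };

// Reduces a row-major rows x cols tensor along its second dimension.
// The result holds one value per row (a rows x 1 destination).
std::vector<float> reduce_rows(const std::vector<float>& src,
                               std::size_t rows, std::size_t cols,
                               ReduceOp op) {
    assert(cols > 0 && src.size() == rows * cols);
    std::vector<float> dst(rows);
    for (std::size_t f = 0; f < rows; ++f) {       // f: idle-dimension index
        float acc = src[f * cols];                 // seed with the first element
        for (std::size_t r = 1; r < cols; ++r) {   // r: reduction-dimension index
            const float v = src[f * cols + r];
            switch (op) {
                case ReduceOp::Max: acc = std::max(acc, v); break;
                case ReduceOp::Min: acc = std::min(acc, v); break;
                case ReduceOp::Mul: acc *= v; break;
                case ReduceOp::Sum:
                case ReduceOp::Mean: acc += v; break;
            }
        }
        if (op == ReduceOp::Mean) {
            acc /= static_cast<float>(cols);       // divide the sum by R
        }
        dst[f] = acc;
    }
    return dst;
}
```

For a 2 x 3 source holding the values 1 through 6, `reduce_rows(src, 2, 3, ReduceOp::Mean)` yields one mean per row, with \(R = 3\).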

Lp-norm:

\[\dst(f) = \root p \of {\mathop{eps\_op}(\sum\limits_{r}|\src(r)|^p, eps)},\]

where \(eps\_op\) can be max or sum.

Lp-norm-power-p:

\[\dst(f) = \mathop{eps\_op}(\sum\limits_{r}|\src(r)|^p, eps),\]

where \(eps\_op\) can be max or sum.
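The two norm formulas differ only in the final root. A minimal sketch in plain C++, assuming a single reduction over a flat vector and \(p > 0\), shows how \(eps\_op\) combines the accumulated sum with \(eps\). The names `EpsOp`, `lp_norm_power_p` and `lp_norm` are hypothetical and are not part of the oneDNN API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical names for illustration only; not part of the oneDNN API.
enum class EpsOp { Max, Sum };

// Computes eps_op(sum_r |src(r)|^p, eps) for one reduced slice.
float lp_norm_power_p(const std::vector<float>& src, float p, float eps,
                      EpsOp eps_op) {
    assert(p > 0.0f);  // assumption of this sketch
    float acc = 0.0f;
    for (float v : src) {
        acc += std::pow(std::fabs(v), p);
    }
    return eps_op == EpsOp::Max ? std::max(acc, eps) : acc + eps;
}

// Lp-norm is the p-th root of the Lp-norm-power-p value.
float lp_norm(const std::vector<float>& src, float p, float eps,
              EpsOp eps_op) {
    return std::pow(lp_norm_power_p(src, p, eps, eps_op), 1.0f / p);
}
```

With the max variant, \(eps\) acts as a lower bound on the accumulated sum, which keeps the result away from zero for an all-zero slice. With the sum variant, \(eps\) is always added.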

Notes#

- The reduction primitive requires the source and destination tensors to have the same number of dimensions.
- Reduction dimensions have size 1 in the destination tensor.
- The reduction primitive does not have a notion of forward or backward propagation.
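The shape rules above imply that the reduction dimensions can be read off by comparing source and destination shapes. The following plain C++ sketch shows one way to check a shape pair and list the reduced axes. `Dims`, `is_valid_reduction_shape` and `reduction_axes` are hypothetical helpers, not oneDNN functions, and the sketch assumes that every non-reduced dimension must match the source exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; not part of the oneDNN API.
using Dims = std::vector<std::int64_t>;

// True when dst has the same rank as src and every dimension either
// matches the source (idle) or has size 1 (reduced).
bool is_valid_reduction_shape(const Dims& src, const Dims& dst) {
    if (src.size() != dst.size()) return false;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (dst[i] != src[i] && dst[i] != 1) return false;
    }
    return true;
}

// Returns the indices of the dimensions that are actually reduced,
// that is, where the destination has size 1 and the source does not.
std::vector<std::size_t> reduction_axes(const Dims& src, const Dims& dst) {
    assert(is_valid_reduction_shape(src, dst));
    std::vector<std::size_t> axes;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (dst[i] == 1 && src[i] != 1) axes.push_back(i);
    }
    return axes;
}
```

For example, a source of shape {2, 3, 4, 5} with a destination of shape {2, 1, 4, 1} reduces over axes 1 and 3, while a destination of shape {2, 4} would be rejected because the ranks differ.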

Execution Arguments#

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.

Primitive input/output	Execution argument index
\(\src\)	DNNL_ARG_SRC
\(\dst\)	DNNL_ARG_DST
\(\text{binary post-op}\)	DNNL_ARG_ATTR_MULTIPLE_POST_OP(binary_post_op_position) | DNNL_ARG_SRC_1

Implementation Details#

General Notes#

The \(\dst\) memory format can be specified either explicitly or with dnnl::memory::format_tag::any (recommended), in which case the primitive derives the most appropriate memory format from the format of the source tensor.

Post-Ops and Attributes#

The following attributes are supported:

Type	Operation	Description	Restrictions
Post-op	Sum	Adds the operation result to the destination tensor instead of overwriting it.
Post-op	Eltwise	Applies an Eltwise operation to the result.
Post-op	Binary	Applies a Binary operation to the result.	General binary post-op restrictions
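To make the difference between overwriting and accumulating concrete, here is a minimal plain C++ sketch of what the Sum and Eltwise post-ops do to an already computed reduction result. It does not use the oneDNN attribute API. The function names are hypothetical, ReLU is chosen only as an example Eltwise operation, and scaling factors are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; not part of the oneDNN API.

// No post-op: the reduction result overwrites the destination.
void store_overwrite(std::vector<float>& dst,
                     const std::vector<float>& result) {
    assert(dst.size() == result.size());
    dst = result;
}

// Sum post-op: the result is added to the values already in dst.
void store_with_sum(std::vector<float>& dst,
                    const std::vector<float>& result) {
    assert(dst.size() == result.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] += result[i];
    }
}

// Eltwise post-op, with ReLU as the example operation: applied
// element by element to the reduction result.
void apply_relu(std::vector<float>& result) {
    for (float& v : result) {
        v = std::max(v, 0.0f);
    }
}
```

In this sketch, a result of {-3, 4} passed through `apply_relu` becomes {0, 4}. Storing it with `store_with_sum` into a destination that already holds {1, 2} leaves {1, 6}, whereas `store_overwrite` would leave {0, 4}.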

Data Types Support#

The source and destination tensors may have f32, bf16, f16 or int8 data types. See the Data Types page for more details.

Data Representation#

Sources, Destination#

The reduction primitive works with arbitrary data tensors. There is no special meaning associated with any of the dimensions of a tensor.

Implementation Limitations#

Refer to Data Types for limitations related to data type support.
GPU
- Only tensors of 6 or fewer dimensions are supported.

Performance Tips#

Whenever possible, avoid specifying different memory formats for source and destination tensors.

Example#

Reduction Primitive Example

This C++ API example demonstrates how to create and execute a Reduction primitive.
