Reduction#
General#
The reduction primitive performs a reduction operation on arbitrary data. Each element in the destination is the result of a reduction operation with the specified algorithm along one or multiple source tensor dimensions:
\[\dst(f) = \mathop{reduce\_op}\limits_{r}\src(r),\]
where \(reduce\_op\) can be max, min, sum, mul, mean, Lp-norm and Lp-norm-power-p, \(f\) is an index in an idle dimension and \(r\) is an index in a reduction dimension.
Mean:
\[\dst(f) = \frac{\sum\limits_{r}\src(r)}{R},\]
where \(R\) is the size of a reduction dimension.
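The general reduction and the mean case described above can be sketched in plain C++ for the simplest layout: a row-major 2D tensor of shape F x R that is reduced along its last dimension. This is only a reference sketch of the semantics, not the library's implementation; `ReduceOp` and `reduce_last_dim` are hypothetical names introduced for illustration, and the sketch assumes R is at least 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical enum for illustration; not a oneDNN type.
enum class ReduceOp { Max, Min, Sum, Mul, Mean };

// Reduces a row-major [F x R] tensor along its last dimension.
// f indexes the idle dimension, r indexes the reduction dimension.
// Assumes R >= 1 and src.size() == F * R.
std::vector<float> reduce_last_dim(const std::vector<float> &src,
        std::size_t F, std::size_t R, ReduceOp op) {
    std::vector<float> dst(F);
    for (std::size_t f = 0; f < F; ++f) {
        const float *row = src.data() + f * R;
        float acc = row[0];
        for (std::size_t r = 1; r < R; ++r) {
            switch (op) {
                case ReduceOp::Max: acc = std::max(acc, row[r]); break;
                case ReduceOp::Min: acc = std::min(acc, row[r]); break;
                case ReduceOp::Sum:
                case ReduceOp::Mean: acc += row[r]; break;
                case ReduceOp::Mul: acc *= row[r]; break;
            }
        }
        // Mean divides the accumulated sum by the reduction size R.
        if (op == ReduceOp::Mean) acc /= static_cast<float>(R);
        dst[f] = acc;
    }
    return dst;
}
```

For a 2 x 3 input holding the values 1 through 6, `ReduceOp::Sum` yields one value per row, and `ReduceOp::Mean` yields the same sums divided by 3.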
Lp-norm:
\[\dst(f) = \sqrt[p]{\mathop{eps\_op}\left(\sum\limits_{r}|\src(r)|^p, eps\right)},\]
where \(eps\_op\) can be max or sum.
Lp-norm-power-p:
\[\dst(f) = \mathop{eps\_op}\left(\sum\limits_{r}|\src(r)|^p, eps\right),\]
where \(eps\_op\) can be max or sum.
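To make the role of \(eps\_op\) concrete, the following plain C++ sketch computes both Lp-norm variants over a single reduction group. It assumes that \(eps\_op\) combines the accumulated sum of \(|src|^p\) with a small constant `eps`, either by taking the maximum or by adding it. `EpsOp`, `lp_norm_power_p` and `lp_norm` are hypothetical names, not library functions.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical enum for illustration; not a oneDNN type.
enum class EpsOp { Max, Sum };

// Lp-norm-power-p: eps_op(sum_r |src(r)|^p, eps).
float lp_norm_power_p(const std::vector<float> &src, float p, float eps,
        EpsOp eps_op) {
    float acc = 0.0f;
    for (float v : src)
        acc += std::pow(std::fabs(v), p);
    return eps_op == EpsOp::Max ? std::max(acc, eps) : acc + eps;
}

// Lp-norm: the p-th root of the Lp-norm-power-p value.
float lp_norm(const std::vector<float> &src, float p, float eps,
        EpsOp eps_op) {
    return std::pow(lp_norm_power_p(src, p, eps, eps_op), 1.0f / p);
}
```

With the max variant, `eps` acts as a lower bound on the accumulated sum, which keeps the result away from zero for an all-zero input. With the sum variant, `eps` is always added.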
Notes#
The reduction primitive requires the source and destination tensors to have the same number of dimensions.
Reduction dimensions are of size 1 in a destination tensor.
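The two shape rules above can be expressed as a small check. The sketch below is plain C++ and only encodes what these notes state (equal rank, reduced dimensions of size 1). It does not reproduce any further validation the library may perform, and `Dims`, `make_dst_dims` and `is_valid_reduction_shape` are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Dims = std::vector<std::int64_t>;

// Builds destination dims: same rank as the source, with every reduced
// axis set to size 1. Throws std::out_of_range for an invalid axis.
Dims make_dst_dims(const Dims &src, const std::vector<std::size_t> &axes) {
    Dims dst = src;
    for (std::size_t axis : axes)
        dst.at(axis) = 1;
    return dst;
}

// Returns true if dst has the same rank as src and every dimension
// either matches the source (idle) or has size 1 (reduced).
bool is_valid_reduction_shape(const Dims &src, const Dims &dst) {
    if (src.size() != dst.size()) return false;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (dst[i] != src[i] && dst[i] != 1) return false;
    }
    return true;
}
```

For example, reducing a 2 x 3 x 4 x 5 source over its last two axes gives a 2 x 3 x 1 x 1 destination, whereas a rank-2 destination of 2 x 3 would be rejected by this check.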
The reduction primitive does not have a notion of forward or backward propagations.
Execution Arguments#
When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.
Primitive input/output | Execution argument index |
---|---|
\(\src\) | DNNL_ARG_SRC |
\(\dst\) | DNNL_ARG_DST |
\(\text{binary post-op}\) | DNNL_ARG_ATTR_MULTIPLE_POST_OP(binary_post_op_position) \| DNNL_ARG_SRC_1 |
Implementation Details#
General Notes#
The \(\dst\) memory format can be either specified explicitly or by dnnl::memory::format_tag::any (recommended), in which case the primitive will derive the most appropriate memory format based on the format of the source tensor.
Post-Ops and Attributes#
The following attributes are supported:
Data Types Support#
The source and destination tensors may have f32, bf16, f16 or int8 data types. See the Data Types page for more details.
Data Representation#
Sources, Destination#
The reduction primitive works with arbitrary data tensors. There is no special meaning associated with any of the dimensions of a tensor.
Implementation Limitations#
Refer to Data Types for limitations related to data types support.
GPU: Only tensors of 6 or fewer dimensions are supported.
Performance Tips#
Whenever possible, avoid specifying different memory formats for source and destination tensors.
Example#
This C++ API example demonstrates how to create and execute a Reduction primitive.