Reduction Fusion Patterns

Overview

The Reduction category includes operations such as: ReduceL1, ReduceL2, ReduceMax, ReduceMean, ReduceMin, ReduceProd, ReduceSum.
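As a rough illustration of what each operation computes, the following sketch gives plain-Python reference semantics for the seven reductions, applied along the last axis of a small 2-D input. It assumes the usual mathematical definitions (for example, ReduceL1 as the sum of absolute values and ReduceL2 as the square root of the sum of squares). The names `REDUCTIONS` and `reduce_rows` are hypothetical helpers for this example and are not part of oneDNN.

```python
import math

# Reference semantics for each reduction over a flat list of floats.
# These mirror the mathematical definitions only; this is not oneDNN code.
REDUCTIONS = {
    "ReduceL1": lambda xs: sum(abs(x) for x in xs),
    "ReduceL2": lambda xs: math.sqrt(sum(x * x for x in xs)),
    "ReduceMax": max,
    "ReduceMean": lambda xs: sum(xs) / len(xs),
    "ReduceMin": min,
    "ReduceProd": math.prod,
    "ReduceSum": sum,
}


def reduce_rows(op_name, rows):
    """Reduce each row of a 2-D list along its last axis."""
    fn = REDUCTIONS[op_name]
    return [fn(row) for row in rows]


src = [[3.0, -4.0], [1.0, 2.0]]
for name in REDUCTIONS:
    print(name, reduce_rows(name, src))
```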

oneDNN supports various reduction fusion patterns to optimize performance and reduce memory bandwidth requirements. This document describes the supported fusion patterns for Reduction.

Pattern Structure

oneDNN defines floating-point Reduction fusion patterns as follows. The blue nodes are required when defining a Reduction fusion pattern while the brown nodes are optional.

[Figure: Reduction pattern]
  1. Reduction Operation: Performs the corresponding reduction on the src tensor. See the ReduceL1, ReduceL2, ReduceMax, ReduceMean, ReduceMin, ReduceProd and ReduceSum operations in the Graph API for more details.

  2. Epilogue Subgraph: Optional. It can include Binary and Unary operations.

    Combination Rules:

    [Figure: epilogue subgraph]
    • The epilogue subgraph supports 0 to 20 Binary or Unary operations (N=20).
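The pattern structure above can be sketched in plain Python as a reduction followed by an optional chain of epilogue operations, applied to each reduced value before it is written to dst. This is a conceptual model only, not the oneDNN Graph API: `fused_reduce_sum`, the `(value, index)` callable convention, and the `bias` operand are all assumptions made for this example. The 20-operation limit is taken from the combination rules.

```python
MAX_EPILOGUE_OPS = 20  # limit stated in the combination rules


def fused_reduce_sum(rows, epilogue):
    """ReduceSum over each row, then apply the epilogue chain to the result.

    `epilogue` is a list of callables (a hypothetical representation).
    Each takes (value, index) and returns the new value, so a binary
    operation can look up its second operand by index.
    """
    if len(epilogue) > MAX_EPILOGUE_OPS:
        raise ValueError("epilogue supports at most 20 Binary or Unary ops")
    dst = []
    for i, row in enumerate(rows):
        value = sum(row)        # reduction operation
        for op in epilogue:     # epilogue runs before the value is stored
            value = op(value, i)
        dst.append(value)
    return dst


bias = [10.0, 20.0]


def relu(value, index):
    """Unary epilogue operation."""
    return max(value, 0.0)


def add_bias(value, index):
    """Binary epilogue operation; the second operand comes from `bias`."""
    return value + bias[index]


result = fused_reduce_sum([[1.0, -3.0], [2.0, 2.0]], [relu, add_bias])
print(result)
```

The point of the sketch is that the intermediate reduction result is never stored as a separate tensor: each value goes through the whole epilogue before being appended to dst, which is the memory-traffic saving that fusion aims for.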

Data Types

oneDNN supports the following combinations of data types for src and dst:

src               dst
f32, bf16, f16    f32, bf16, f16

The data type definitions and their support status on different CPU and GPU platforms follow the general description in the Data Types Guide.

Implementation Notes

A post-binary Add in the epilogue subgraph can run in place when two conditions hold: the Add is the last operation in the epilogue subgraph, and the dst output has the same shape and the same data type size as the Add input. An in-place operation overwrites the original input data. Use in-place operations whenever possible for better performance.
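The in-place conditions can be expressed as a small predicate. The sketch below is illustrative only and does not call oneDNN. The function name `add_can_run_in_place`, the use of operation-name strings, and the shape tuples are assumptions for this example. The byte sizes are the standard element sizes of the three supported data types.

```python
DTYPE_SIZE = {"f32": 4, "bf16": 2, "f16": 2}  # bytes per element


def add_can_run_in_place(epilogue_ops, add_input_shape, add_input_dtype,
                         dst_shape, dst_dtype):
    """Return True if the trailing post-binary Add may overwrite its input.

    `epilogue_ops` is a list of operation names in execution order
    (a hypothetical representation of the epilogue subgraph).
    """
    # Condition 1: the Add must be the last operation in the epilogue.
    if not epilogue_ops or epilogue_ops[-1] != "Add":
        return False
    # Condition 2: identical shape and identical data type size.
    same_shape = tuple(add_input_shape) == tuple(dst_shape)
    same_size = DTYPE_SIZE[add_input_dtype] == DTYPE_SIZE[dst_dtype]
    return same_shape and same_size


print(add_can_run_in_place(["ReLU", "Add"], (8, 1), "f32", (8, 1), "f32"))
print(add_can_run_in_place(["Add", "ReLU"], (8, 1), "f32", (8, 1), "f32"))
print(add_can_run_in_place(["Add"], (8, 1), "bf16", (8, 1), "f32"))
```

Note that the condition is on data type size, not on the data type itself, so under this reading a bf16 input and an f16 dst (both 2 bytes) would satisfy the size check.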