Norm Fusion Patterns

Overview

The Norm category for inference includes operations such as GroupNorm, LayerNorm, and BatchNormInference.

oneDNN supports various Norm fusion patterns to optimize performance and reduce memory bandwidth requirements. This document describes the supported fusion patterns for Norm.

Pattern Structure

oneDNN defines floating-point Norm fusion patterns as follows. The blue nodes are required when defining a Norm fusion pattern, while the brown nodes are optional. A code sketch of the complete chain follows the list below.

[Figure: Norm fusion pattern]
  1. Norm Operation: Performs the corresponding norm operation on the src tensor. See the GroupNorm, LayerNorm, and BatchNormInference operations in the Graph API for more details.

  2. F2F Conversion Subgraph : Converts the output tensor from floating-point to another floating-point data type. It is constructed by a TypeCast operation.

    [Figure: F2F conversion subgraph]
  3. Epilogue Subgraph : Optional and can include Binary and Unary operations.

    Combination Rules:

    [Figure: Epilogue subgraph]
    • 0 to 4 Binary or Unary operations are supported in the epilogue subgraph.

  4. F2Q Conversion Subgraph : Converts the output tensor from floating-point to a quantized data type. It is constructed by a Quantize operation.

    [Figure: F2Q conversion subgraph]
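
As a concrete illustration, here is a minimal C++ sketch, assuming the oneDNN Graph API header oneapi/dnnl/dnnl_graph.hpp, that wires up the chain described above: a LayerNorm as the Norm operation, a TypeCast as the F2F conversion subgraph, a single GELU as the epilogue, and a Quantize as the F2Q conversion subgraph. The tensor shape, ids, scale, and zero point are illustrative placeholders, and whether all four operations land in one partition depends on the backend and library version.

```cpp
// Minimal sketch: LayerNorm -> TypeCast -> GELU -> Quantize built with the
// oneDNN Graph API. Shapes, ids, scales, and zero points are placeholders.
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    const std::vector<int64_t> shape {8, 128, 768}; // hypothetical (N, T, C)

    // Logical tensors along the chain: bf16 -> bf16 -> f32 -> f32 -> u8.
    logical_tensor src(0, logical_tensor::data_type::bf16, shape,
                       logical_tensor::layout_type::strided);
    logical_tensor ln_out(1, logical_tensor::data_type::bf16, shape,
                          logical_tensor::layout_type::strided);
    logical_tensor tc_out(2, logical_tensor::data_type::f32, shape,
                          logical_tensor::layout_type::strided);
    logical_tensor gelu_out(3, logical_tensor::data_type::f32, shape,
                            logical_tensor::layout_type::strided);
    logical_tensor dst(4, logical_tensor::data_type::u8, shape,
                       logical_tensor::layout_type::strided);

    // 1. Norm operation: LayerNorm without affine/stats to keep the sketch small.
    op ln(0, op::kind::LayerNorm, {src}, {ln_out}, "layernorm");
    ln.set_attr<bool>(op::attr::keep_stats, false);
    ln.set_attr<bool>(op::attr::use_affine, false);

    // 2. F2F conversion subgraph: TypeCast bf16 -> f32.
    op tc(1, op::kind::TypeCast, {ln_out}, {tc_out}, "typecast");

    // 3. Epilogue subgraph: one unary operation (GELU).
    op gelu(2, op::kind::GELU, {tc_out}, {gelu_out}, "gelu");

    // 4. F2Q conversion subgraph: Quantize f32 -> u8 (per-tensor).
    op quant(3, op::kind::Quantize, {gelu_out}, {dst}, "quantize");
    quant.set_attr<std::string>(op::attr::qtype, "per_tensor");
    quant.set_attr<std::vector<float>>(op::attr::scales, {0.1f});
    quant.set_attr<std::vector<int64_t>>(op::attr::zps, {0});

    // Add the ops to a graph and let oneDNN decide what to fuse.
    graph g(dnnl::engine::kind::cpu);
    g.add_op(ln);
    g.add_op(tc);
    g.add_op(gelu);
    g.add_op(quant);
    g.finalize();

    // A single returned partition indicates the whole chain was fused.
    auto partitions = g.get_partitions();
    return partitions.size() == 1 ? 0 : 1;
}
```

Compiling and executing the returned partitions follows the usual Graph API workflow and is omitted here.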

Data Types

oneDNN supports the following combinations of data types for src and dst:

src            | dst
---------------|------------------------
bf16, f16, f32 | u8, s8, bf16, f16, f32

The definition of data types and their support status on different CPU and GPU platforms follow the general description in the Data Types Guide.
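
For example, the floating-point-only combinations in the table above can be exercised without any quantization. The following is a minimal sketch, assuming the same oneDNN Graph C++ API, that takes an f32 src and produces a bf16 dst through the F2F conversion subgraph alone; the shape and ids are illustrative placeholders.

```cpp
// Minimal sketch: f32 src, bf16 dst via LayerNorm followed by a TypeCast only.
// Shape and ids are placeholders.
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    const std::vector<int64_t> shape {16, 1024};

    logical_tensor src(0, logical_tensor::data_type::f32, shape,
                       logical_tensor::layout_type::strided);
    logical_tensor ln_out(1, logical_tensor::data_type::f32, shape,
                          logical_tensor::layout_type::strided);
    logical_tensor dst(2, logical_tensor::data_type::bf16, shape,
                       logical_tensor::layout_type::strided);

    op ln(0, op::kind::LayerNorm, {src}, {ln_out}, "layernorm");
    ln.set_attr<bool>(op::attr::keep_stats, false);
    ln.set_attr<bool>(op::attr::use_affine, false);

    // F2F conversion subgraph: down-convert the normalized output to bf16.
    op tc(1, op::kind::TypeCast, {ln_out}, {dst}, "typecast");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(ln);
    g.add_op(tc);
    g.finalize();
    return g.get_partitions().size() == 1 ? 0 : 1;
}
```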

Implementation Limitations

  1. BatchNormInference:

    1. The Epilogue Subgraph supports only ReLU which, if present, can appear only once (see the sketch after this list).

    2. F2F and F2Q Conversion Subgraphs are not supported.
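
As referenced above, the following is a minimal sketch, assuming the oneDNN Graph C++ API, of the one epilogue combination these limitations allow: BatchNormInference followed by a single ReLU, with no F2F or F2Q conversion. Shapes, ids, and epsilon are illustrative placeholders.

```cpp
// Minimal sketch: BatchNormInference followed by a single ReLU, the only
// epilogue the limitation above allows. Shapes, ids, and epsilon are placeholders.
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    const std::vector<int64_t> src_shape {8, 56, 56, 64}; // NXC (channels-last)
    const std::vector<int64_t> ch_shape {64};

    logical_tensor src(0, logical_tensor::data_type::f32, src_shape,
                       logical_tensor::layout_type::strided);
    logical_tensor gamma(1, logical_tensor::data_type::f32, ch_shape,
                         logical_tensor::layout_type::strided);
    logical_tensor beta(2, logical_tensor::data_type::f32, ch_shape,
                        logical_tensor::layout_type::strided);
    logical_tensor mean(3, logical_tensor::data_type::f32, ch_shape,
                        logical_tensor::layout_type::strided);
    logical_tensor var(4, logical_tensor::data_type::f32, ch_shape,
                       logical_tensor::layout_type::strided);
    logical_tensor bn_out(5, logical_tensor::data_type::f32, src_shape,
                          logical_tensor::layout_type::strided);
    logical_tensor dst(6, logical_tensor::data_type::f32, src_shape,
                       logical_tensor::layout_type::strided);

    op bn(0, op::kind::BatchNormInference, {src, gamma, beta, mean, var},
          {bn_out}, "batchnorm");
    bn.set_attr<float>(op::attr::epsilon, 1e-5f);

    // ReLU is the single epilogue operation supported after BatchNormInference.
    op relu(1, op::kind::ReLU, {bn_out}, {dst}, "relu");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(bn);
    g.add_op(relu);
    g.finalize();
    return g.get_partitions().size() == 1 ? 0 : 1;
}
```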