Norm Fusion Patterns

Overview

The Norm category for inference includes operations such as GroupNorm, LayerNorm, and BatchNormInference.

oneDNN supports various Norm fusion patterns to optimize performance and reduce memory bandwidth requirements. This document describes the supported fusion patterns for Norm.

Pattern Structure

oneDNN defines floating-point Norm fusion patterns as follows. The blue nodes are required when defining a Norm fusion pattern, while the brown nodes are optional. A code sketch of the complete chain follows the list below.

[Figure: Norm fusion pattern]
  1. Norm Operation: Performs the corresponding norm operation on the src tensor. See the GroupNorm, LayerNorm, and BatchNormInference operations in the Graph API for more details.

  2. F2F Conversion Subgraph : Converts the output tensor from floating-point to another floating-point data type. It is constructed by a TypeCast operation.

    [Figure: F2F conversion subgraph]
  3. Epilogue Subgraph : Optional and can include Binary and Unary operations.

    Combination Rules:

    [Figure: Epilogue subgraph]
    • 0 to 4 Binary or Unary operations are supported in the epilogue subgraph.

  4. F2Q Conversion Subgraph : Converts the output tensor from floating-point to a quantized data type. It is constructed by a Quantize operation.

    [Figure: F2Q conversion subgraph]
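
As a concrete illustration, here is a minimal C++ sketch, assuming the oneDNN Graph API header oneapi/dnnl/dnnl_graph.hpp, that wires up the chain described above: a LayerNorm as the Norm operation, a TypeCast as the F2F conversion subgraph, a single GELU as the epilogue, and a Quantize as the F2Q conversion subgraph. The tensor shape, ids, scale, and zero point are illustrative placeholders, and whether all four operations land in one partition depends on the backend and library version.

```cpp
// Minimal sketch: LayerNorm -> TypeCast -> GELU -> Quantize built with the
// oneDNN Graph API. Shapes, ids, scales, and zero points are placeholders.
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    const std::vector<int64_t> shape {8, 128, 768}; // hypothetical (N, T, C)

    // Logical tensors along the chain: bf16 -> bf16 -> f32 -> f32 -> u8.
    logical_tensor src(0, logical_tensor::data_type::bf16, shape,
                       logical_tensor::layout_type::strided);
    logical_tensor ln_out(1, logical_tensor::data_type::bf16, shape,
                          logical_tensor::layout_type::strided);
    logical_tensor tc_out(2, logical_tensor::data_type::f32, shape,
                          logical_tensor::layout_type::strided);
    logical_tensor gelu_out(3, logical_tensor::data_type::f32, shape,
                            logical_tensor::layout_type::strided);
    logical_tensor dst(4, logical_tensor::data_type::u8, shape,
                       logical_tensor::layout_type::strided);

    // 1. Norm operation: LayerNorm without affine/stats to keep the sketch small.
    op ln(0, op::kind::LayerNorm, {src}, {ln_out}, "layernorm");
    ln.set_attr<bool>(op::attr::keep_stats, false);
    ln.set_attr<bool>(op::attr::use_affine, false);

    // 2. F2F conversion subgraph: TypeCast bf16 -> f32.
    op tc(1, op::kind::TypeCast, {ln_out}, {tc_out}, "typecast");

    // 3. Epilogue subgraph: one unary operation (GELU).
    op gelu(2, op::kind::GELU, {tc_out}, {gelu_out}, "gelu");

    // 4. F2Q conversion subgraph: Quantize f32 -> u8 (per-tensor).
    op quant(3, op::kind::Quantize, {gelu_out}, {dst}, "quantize");
    quant.set_attr<std::string>(op::attr::qtype, "per_tensor");
    quant.set_attr<std::vector<float>>(op::attr::scales, {0.1f});
    quant.set_attr<std::vector<int64_t>>(op::attr::zps, {0});

    // Add the ops to a graph and let oneDNN decide what to fuse.
    graph g(dnnl::engine::kind::cpu);
    g.add_op(ln);
    g.add_op(tc);
    g.add_op(gelu);
    g.add_op(quant);
    g.finalize();

    // A single returned partition indicates the whole chain was fused.
    auto partitions = g.get_partitions();
    return partitions.size() == 1 ? 0 : 1;
}
```

Compiling and executing the returned partitions follows the usual Graph API workflow and is omitted here.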

Data Types

oneDNN supports the following combinations of data types for src and dst:

src            | dst
---------------|------------------------
bf16, f16, f32 | u8, s8, bf16, f16, f32

The definition of data types and their support status on different CPU and GPU platforms follow the general description in the Data Types Guide.
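
For example, the floating-point-only combinations in the table above can be exercised without any quantization. The following is a minimal sketch, assuming the same oneDNN Graph C++ API, that takes an f32 src and produces a bf16 dst through the F2F conversion subgraph alone; the shape and ids are illustrative placeholders.

```cpp
// Minimal sketch: f32 src, bf16 dst via LayerNorm followed by a TypeCast only.
// Shape and ids are placeholders.
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    const std::vector<int64_t> shape {16, 1024};

    logical_tensor src(0, logical_tensor::data_type::f32, shape,
                       logical_tensor::layout_type::strided);
    logical_tensor ln_out(1, logical_tensor::data_type::f32, shape,
                          logical_tensor::layout_type::strided);
    logical_tensor dst(2, logical_tensor::data_type::bf16, shape,
                       logical_tensor::layout_type::strided);

    op ln(0, op::kind::LayerNorm, {src}, {ln_out}, "layernorm");
    ln.set_attr<bool>(op::attr::keep_stats, false);
    ln.set_attr<bool>(op::attr::use_affine, false);

    // F2F conversion subgraph: down-convert the normalized output to bf16.
    op tc(1, op::kind::TypeCast, {ln_out}, {dst}, "typecast");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(ln);
    g.add_op(tc);
    g.finalize();
    return g.get_partitions().size() == 1 ? 0 : 1;
}
```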

Implementation Limitations

  1. BatchNormInference:

    1. The Epilogue Subgraph supports only ReLU which, if present, can appear only once (see the sketch after this list).

    2. F2F and F2Q Conversion Subgraphs are not supported.
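
As referenced above, the following is a minimal sketch, assuming the oneDNN Graph C++ API, of the one epilogue combination these limitations allow: BatchNormInference followed by a single ReLU, with no F2F or F2Q conversion. Shapes, ids, and epsilon are illustrative placeholders.

```cpp
// Minimal sketch: BatchNormInference followed by a single ReLU, the only
// epilogue the limitation above allows. Shapes, ids, and epsilon are placeholders.
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    const std::vector<int64_t> src_shape {8, 56, 56, 64}; // NXC (channels-last)
    const std::vector<int64_t> ch_shape {64};

    logical_tensor src(0, logical_tensor::data_type::f32, src_shape,
                       logical_tensor::layout_type::strided);
    logical_tensor gamma(1, logical_tensor::data_type::f32, ch_shape,
                         logical_tensor::layout_type::strided);
    logical_tensor beta(2, logical_tensor::data_type::f32, ch_shape,
                        logical_tensor::layout_type::strided);
    logical_tensor mean(3, logical_tensor::data_type::f32, ch_shape,
                        logical_tensor::layout_type::strided);
    logical_tensor var(4, logical_tensor::data_type::f32, ch_shape,
                       logical_tensor::layout_type::strided);
    logical_tensor bn_out(5, logical_tensor::data_type::f32, src_shape,
                          logical_tensor::layout_type::strided);
    logical_tensor dst(6, logical_tensor::data_type::f32, src_shape,
                       logical_tensor::layout_type::strided);

    op bn(0, op::kind::BatchNormInference, {src, gamma, beta, mean, var},
          {bn_out}, "batchnorm");
    bn.set_attr<float>(op::attr::epsilon, 1e-5f);

    // ReLU is the single epilogue operation supported after BatchNormInference.
    op relu(1, op::kind::ReLU, {bn_out}, {dst}, "relu");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(bn);
    g.add_op(relu);
    g.finalize();
    return g.get_partitions().size() == 1 ? 0 : 1;
}
```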