Convolution Fusion Patterns¶
Overview¶
oneDNN supports both floating-point and quantized Convolution fusion patterns to optimize performance and reduce memory bandwidth requirements. This document describes the supported floating-point fusion patterns for Convolution. For quantized Convolution fusion patterns, refer to Quantized Convolution Fusion Patterns for more details.
Pattern Structure¶
oneDNN defines floating-point Convolution fusion patterns as follows. The blue nodes are required when defining a Convolution fusion pattern, while the brown nodes are optional.

Convolution Operation: performs convolution between the src and weights tensors. The bias tensor is optional. See the Convolution operation in the Graph API for more details.
Epilogue Subgraph: optional, and can include the following operations:
- BiasAdd operation.
- BatchNormInference operation.
- Convolution operation.
- Binary and Unary operations: refer to the Note in Fusion Patterns.
Combination Rules:
- BiasAdd: if present, must be the first op in the epilogue subgraph and can appear only once.
- BatchNormInference: if present, must precede any Binary or Unary operations and can appear only once.
- Convolution: if present, is a Depthwise Convolution that can only be fused with a 1x1 Convolution and can appear only once.
- 0 to 4 Binary or Unary operations are supported in the epilogue subgraph.
F2F Conversion Subgraph: converts the output tensor from one floating-point data type to another. It is constructed with a TypeCast operation.
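The combination rules for the epilogue subgraph can be expressed as a small validity check. The sketch below is purely illustrative and not part of the oneDNN API: the `check_epilogue` helper and the op-kind strings are hypothetical, and the set of Binary/Unary op names shown is a partial example, not the full list from Fusion Patterns.

```python
# Illustrative sketch only: validate an epilogue op sequence against the
# combination rules above. Not part of the oneDNN API; this helper and
# the op-name strings are hypothetical.

BINARY_UNARY = {"Add", "Multiply", "ReLU", "Sigmoid", "HardSwish"}  # examples only

def check_epilogue(ops):
    """Return True if `ops` (a list of op-kind strings) is a legal
    epilogue subgraph for a floating-point Convolution fusion pattern."""
    # BiasAdd: at most once, and only as the first op.
    if ops.count("BiasAdd") > 1:
        return False
    if "BiasAdd" in ops and ops[0] != "BiasAdd":
        return False
    # BatchNormInference: at most once, and before any Binary/Unary op.
    if ops.count("BatchNormInference") > 1:
        return False
    if "BatchNormInference" in ops:
        bn = ops.index("BatchNormInference")
        if any(o in BINARY_UNARY for o in ops[:bn]):
            return False
    # Depthwise Convolution post-op: at most once.
    if ops.count("Convolution") > 1:
        return False
    # At most 4 Binary/Unary operations.
    if sum(o in BINARY_UNARY for o in ops) > 4:
        return False
    return True

print(check_epilogue(["BiasAdd", "BatchNormInference", "ReLU"]))  # True
print(check_epilogue(["ReLU", "BiasAdd"]))  # False: BiasAdd must come first
```

In a real integration these checks are performed internally by the oneDNN Graph partitioner when it decides which ops to fuse into a partition; the sketch only restates the documented rules.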
Data Types¶
oneDNN supports the following combinations of data types for src, weights, bias and dst:
| src | weights | bias | dst |
|---|---|---|---|
| f32, bf16, f16 | f32, bf16, f16 | f32, bf16, f16 | f32, bf16, f16 |
The definition of the data types and support status on different CPU and GPU platforms follow the general description in the Data Types Guide.
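Per the table above, each tensor must use one of the f32, bf16, or f16 floating-point data types. A minimal sketch of that membership check follows; the `supported_dtypes` helper is hypothetical, and it deliberately does not model any cross-tensor consistency rules, which follow the Data Types Guide.

```python
# Hypothetical helper: check that each tensor of a floating-point
# Convolution fusion pattern uses a data type from the table above.
# Cross-tensor consistency rules (see the Data Types Guide) are not
# modeled here.

ALLOWED = {"f32", "bf16", "f16"}

def supported_dtypes(src, weights, dst, bias=None):
    tensors = [src, weights, dst] + ([bias] if bias is not None else [])
    return all(t in ALLOWED for t in tensors)

print(supported_dtypes("bf16", "bf16", "bf16"))  # True
print(supported_dtypes("f64", "f64", "f64"))  # False: f64 is not supported
```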
Implementation Limitations¶
Convolution as a post-op (Depthwise Convolution) is not supported on GPU.
Convolution and BatchNormInference cannot co-exist in the epilogue subgraph.
The F2F Conversion Subgraph used for the dst tensor only supports bf16 to f32 data type conversion.
Example¶
oneDNN provides a CPU Convolution example and a GPU Convolution example demonstrating how to construct a typical floating-point Convolution fusion pattern with the oneDNN Graph API.