Conventions
The oneDNN specification relies on a set of standard naming conventions for variables. This section describes these conventions.
Variable (Tensor) Names
Neural network models consist of operations of the following form:
\[\dst = f(\src, \weights),\]
where \(\dst\) and \(\src\) are activation tensors, and \(\weights\) are learnable tensors.
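The backward propagation therefore consists of computing the gradients with respect to \(\src\) and \(\weights\), respectively:
\[\operatorname{diff\_src} = df_{\src}(\operatorname{diff\_dst}, \src, \weights, \dst),\]
and
\[\operatorname{diff\_weights} = df_{\weights}(\operatorname{diff\_dst}, \src, \weights, \dst).\]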
While oneDNN uses `src`, `dst`, and `weights` as generic names for the activation and learnable tensors, a specific operation may have its own widely used names for these tensors. For instance, the convolution operation has a learnable tensor called `bias`. For usability reasons, oneDNN primitives use such names in initialization and other functions.
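To make the naming concrete, here is a minimal sketch in plain Python of an inner-product-style operation and its backward pass. It is not the oneDNN API: the function names `inner_product_forward` and `inner_product_backward`, the list-of-lists tensor layout, and the `diff_` prefix on gradient variables are assumptions made for illustration only.

```python
def inner_product_forward(src, weights, bias):
    """dst[n][oc] = sum over ic of src[n][ic] * weights[oc][ic], plus bias[oc]."""
    return [
        [sum(s * w for s, w in zip(src_row, w_row)) + b
         for w_row, b in zip(weights, bias)]
        for src_row in src
    ]


def inner_product_backward(diff_dst, src, weights):
    """Gradients of the forward pass above, given the gradient of dst."""
    n, ic, oc = len(src), len(src[0]), len(weights)
    # diff_src[n][ic] = sum over oc of diff_dst[n][oc] * weights[oc][ic]
    diff_src = [[sum(diff_dst[i][o] * weights[o][c] for o in range(oc))
                 for c in range(ic)] for i in range(n)]
    # diff_weights[oc][ic] = sum over n of diff_dst[n][oc] * src[n][ic]
    diff_weights = [[sum(diff_dst[i][o] * src[i][c] for i in range(n))
                     for c in range(ic)] for o in range(oc)]
    # diff_bias[oc] = sum over n of diff_dst[n][oc]
    diff_bias = [sum(diff_dst[i][o] for i in range(n)) for o in range(oc)]
    return diff_src, diff_weights, diff_bias


src = [[1.0, 2.0], [3.0, 4.0]]                   # 2 samples, 2 input channels
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 output channels
bias = [0.5, 0.5, 0.5]

dst = inner_product_forward(src, weights, bias)
diff_dst = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]    # gradient arriving from the next layer
diff_src, diff_weights, diff_bias = inner_product_backward(diff_dst, src, weights)
```

Note that the backward function takes `diff_dst` together with the forward-pass tensors and produces one gradient per forward input, which mirrors the roles the generic names play in the text above.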
oneDNN uses the following commonly used notations for tensors:
| Name | Meaning |
|---|---|
| `src` | Source tensor |
| `dst` | Destination tensor |
| `weights` | Weights tensor |
| `bias` | Bias tensor (used in convolution, inner product and other primitives) |
| `scale`, `shift` | Scale and shift tensors (used in Batch Normalization and Layer Normalization primitives) |
| `workspace` | Workspace tensor that carries additional information from the forward propagation to the backward propagation |
| `scratchpad` | Temporary tensor that is required to store the intermediate results |
| `diff_src` | Gradient tensor with respect to the source |
| `diff_dst` | Gradient tensor with respect to the destination |
| `diff_weights` | Gradient tensor with respect to the weights |
| `diff_bias` | Gradient tensor with respect to the bias |
| `diff_scale` | Gradient tensor with respect to the scale |
| `diff_shift` | Gradient tensor with respect to the shift |
| `*_layer` | RNN layer data or weights tensors |
| `*_iter` | RNN recurrent data or weights tensors |
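The role of a workspace tensor can be illustrated with a small sketch in plain Python. The example assumes a non-overlapping 1D max pooling in which the forward pass records the position of each maximum so that the backward pass can route gradients without recomputing it. The function names and the choice of argmax indices as the workspace contents are illustrative assumptions; what a real primitive stores in its workspace is up to the implementation.

```python
def max_pool1d_forward(src, window):
    """Non-overlapping 1D max pooling. Returns dst and a workspace of argmax indices."""
    dst, workspace = [], []
    for start in range(0, len(src) - window + 1, window):
        # Index of the first maximum inside the current window.
        idx = max(range(start, start + window), key=lambda i: src[i])
        dst.append(src[idx])
        workspace.append(idx)
    return dst, workspace


def max_pool1d_backward(diff_dst, workspace, src_len):
    """Route each output gradient to the source position recorded in the workspace."""
    diff_src = [0.0] * src_len
    for grad, idx in zip(diff_dst, workspace):
        diff_src[idx] += grad
    return diff_src


src = [1.0, 5.0, 2.0, 0.0, 3.0, 3.0]
dst, workspace = max_pool1d_forward(src, window=2)
diff_src = max_pool1d_backward([10.0, 20.0, 30.0], workspace, len(src))
```

Here the backward pass never reads `src`; everything it needs from the forward pass arrives through `workspace`.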
RNN-Specific Notation
The following notations are used when describing RNN primitives.
| Name | Semantics |
|---|---|
| \(\cdot\) | matrix multiply operator |
| \(*\) | elementwise multiplication operator |
| W | input weights |
| U | recurrent weights |
| \(\Box^T\) | transposition |
| B | bias |
| h | hidden state |
| a | intermediate value |
| x | input |
| \(\Box_t\) | timestamp index |
| \(\Box_l\) | layer index |
| activation | tanh, relu, logistic |
| c | cell state |
| \(\tilde{c}\) | candidate state |
| i | input gate |
| f | forget gate |
| o | output gate |
| u | update gate |
| r | reset gate |
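As an illustration of how this notation reads in practice, the sketch below implements one step of a simple (non-gated) recurrent cell in plain Python, assuming the common formulation \(a_t = W \cdot x_t + U \cdot h_{t-1} + B\), \(h_t = \mathrm{activation}(a_t)\). The helper names `matvec` and `rnn_cell` are hypothetical, and the formulation is an assumption for illustration rather than a definition taken from this section.

```python
import math


def matvec(m, v):
    """Matrix-vector product (the "matrix multiply operator" in the table)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_cell(x_t, h_prev, W, U, B, activation=math.tanh):
    """One time step: a_t = W . x_t + U . h_prev + B, then h_t = activation(a_t)."""
    a_t = [wx + uh + b
           for wx, uh, b in zip(matvec(W, x_t), matvec(U, h_prev), B)]
    return [activation(a) for a in a_t]


W = [[1.0, 0.0], [0.0, 1.0]]   # input weights
U = [[0.5, 0.0], [0.0, 0.5]]   # recurrent weights
B = [0.0, 0.0]                 # bias
h = [0.0, 0.0]                 # initial hidden state

# Iterate over the time index t, carrying the hidden state forward.
for x_t in [[1.0, -1.0], [0.0, 0.0]]:
    h = rnn_cell(x_t, h, W, U, B)
```

Gated cells such as LSTM and GRU follow the same pattern but compute several intermediate values per step (the gates i, f, o, u, r from the table) and combine them with the elementwise operator \(*\).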