SoftMax
General
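The SoftMax operation applies the following formula to every element of the \(\src\) tensor (the variable names follow the standard Naming Conventions):
\[\dst_i = \frac{\exp(\src_i - m)}{\sum_{j=1}^{C} \exp(\src_j - m)}, \qquad m = \max_{1 \le j \le C} \src_j\]
where \(C\) is the size of the tensor along the axis dimension and \(m\) is the maximum value along that axis. Subtracting the maximum value along the axis improves numerical stability.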
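The stable computation described above (subtract the maximum along the axis, exponentiate, then normalize) can be sketched in plain Python. This is only an illustrative reference under stated assumptions, not the library's implementation: the helper names `softmax_1d` and `softmax_2d` are hypothetical, and the sketch handles only rectangular lists of lists with axis 0 or 1.

```python
import math


def softmax_1d(src):
    """Numerically stable softmax over one slice taken along the axis."""
    m = max(src)  # maximum value along the axis
    exps = [math.exp(x - m) for x in src]  # every exponent is <= 0, so exp cannot overflow
    total = sum(exps)  # sum over the C elements of the slice
    return [e / total for e in exps]


def softmax_2d(src, axis=1):
    """Apply softmax_1d along `axis` (0 or 1) of a rectangular list of lists."""
    if axis == 1:
        return [softmax_1d(row) for row in src]
    if axis == 0:
        cols = [softmax_1d(list(col)) for col in zip(*src)]
        return [list(row) for row in zip(*cols)]  # transpose back to row-major
    raise ValueError("this sketch only handles axis 0 or 1")


# The second row would overflow with a naive exp(1000.0); here it stays finite.
dst = softmax_2d([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]], axis=1)
```

Each slice along the chosen axis sums to 1. Without the subtraction, `math.exp(1000.0)` would raise `OverflowError` in Python, which is the kind of failure the maximum subtraction avoids.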
If the optional stats output is requested, it is defined as:
Operation attributes
| Attribute Name | Description | Value Type | Supported Values | Required or Optional |
|---|---|---|---|---|
| axis | Represents the axis along which the SoftMax is calculated. | s64 | Arbitrary s64 value ( | Optional |
| mode | Specifies the computation mode of SoftMax. | string | none, inf_as_zero | Optional |
When the mode attribute is not set or is set to none, the operation performs the normal SoftMax calculation. In this case, the operation generates NaN if all input elements along the axis dimension are -infinity. To prevent this, set the attribute to inf_as_zero so that the operation generates zeros for -infinity inputs.
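The difference between the two modes can be illustrated with a small sketch. It is one reading of the description above, not the library's code: the function name `softmax_with_mode` is hypothetical, and the sketch assumes that inf_as_zero only changes the result for a slice whose elements are all -infinity.

```python
import math


def softmax_with_mode(src, mode="none"):
    """Softmax over one slice along the axis; `mode` is "none" or "inf_as_zero"."""
    if mode not in ("none", "inf_as_zero"):
        raise ValueError("unsupported mode: " + mode)
    m = max(src)
    if mode == "inf_as_zero" and m == -math.inf:
        # Every element is -infinity: return zeros instead of NaN (assumed behavior).
        return [0.0] * len(src)
    # With m == -inf, x - m is (-inf) - (-inf) = NaN, and NaN propagates to the output.
    exps = [math.exp(x - m) for x in src]
    total = sum(exps)
    return [e / total for e in exps]


all_neg_inf = [-math.inf, -math.inf, -math.inf]
default_out = softmax_with_mode(all_neg_inf)                       # NaN in every position
masked_out = softmax_with_mode(all_neg_inf, "inf_as_zero")         # zeros in every position
partial_out = softmax_with_mode([0.0, -math.inf], "inf_as_zero")   # finite input dominates
```

A fully masked slice like `all_neg_inf` is where the two modes diverge. When at least one element is finite, as in `partial_out`, the -infinity entries already map to zero in either mode.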
Execution arguments
When constructing an operation, the inputs and outputs must be provided in the index order listed below.
Inputs
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | src | Required |
Outputs
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | dst | Required |
| 1 | stats | Optional |
Supported data types
The SoftMax operation supports the following data type combinations.
| Src | Dst | Stats |
|---|---|---|
| f32 | f32, bf16, f16 | f32 |
| bf16 | bf16 | f32 |
| f16 | f16 | f32 |
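The combinations listed above can be captured as a small lookup for validating a configuration before building a graph. This is a hypothetical helper written for illustration. The names `SUPPORTED_SRC_DST`, `STATS_DTYPE`, and `is_supported` are assumptions and are not part of any library API.

```python
# Allowed dst types for each src type, transcribed from the table above.
SUPPORTED_SRC_DST = {
    "f32": {"f32", "bf16", "f16"},
    "bf16": {"bf16"},
    "f16": {"f16"},
}
# The optional stats output is f32 in every listed combination.
STATS_DTYPE = "f32"


def is_supported(src, dst, stats=None):
    """Return True if (src, dst[, stats]) matches a row of the table."""
    if dst not in SUPPORTED_SRC_DST.get(src, set()):
        return False
    # stats is optional; when requested it must be f32.
    return stats is None or stats == STATS_DTYPE
```

For example, `is_supported("f32", "bf16")` is true, while `is_supported("bf16", "f32")` is false because a bf16 source only pairs with a bf16 destination.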