SoftMaxBackward#

General#

SoftMaxBackward operation computes gradient for SoftMax.

Operation attributes#

Attribute Name

Description

Value Type

Supported Values

Required or Optional

axis

Represents the axis from which the SoftMax is calculated.

s64

Arbitrary s64 value ( 1 in default)

Optional

Execution arguments#

The inputs and outputs must be provided according to below index order when constructing an operation.

Inputs#

Index

Argument Name

Required or Optional

0

diff_dst

Required

1

dst

Required

Outputs#

Index

Argument Name

Required or Optional

0

diff_src

Required

Supported data types#

SoftMaxBackward operation supports the following data type combinations.

Dst

Diff_dst

Diff_src

f32

f32

f32

bf16

bf16

f32, bf16

f16

f16

f32, f16

Implementation Notes#

SoftMaxBackward supports in-place operations, meaning that diff_dst can be used as both input and output (diff_src). In case of in-place operation, the original diff_dst data will be overwritten. This support is limited to cases when data types of diff_dst and diff_src are identical. Use in-place operations whenever possible for performance.