GELU

General

The GELU operation applies the following formula to every element of the \(src\) tensor (the variable names follow the standard Naming Conventions):

\[dst = 0.5 * src * (1.0 + erf(src / \sqrt{2}))\]

When the attribute mode is set to gelu_tanh, the tanh-based approximation is used instead:

\[dst = 0.5 * src * (1.0 + \tanh[\sqrt{\frac{2}{\pi}} (src + 0.044715 * src^3)])\]
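The two formulas can be compared numerically with a short reference sketch. The code below is not the library implementation; it is a plain Python illustration that assumes the standard GELU definition, in which erf is applied to src / sqrt(2). The function name `gelu` and the list-based interface are hypothetical; only the mode strings gelu_erf and gelu_tanh come from this page.

```python
import math


def gelu(values, mode="gelu_erf"):
    """Apply GELU element-wise to an iterable of floats (illustrative helper)."""
    if mode == "gelu_erf":
        # Exact form: 0.5 * x * (1 + erf(x / sqrt(2)))
        return [0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))) for x in values]
    if mode == "gelu_tanh":
        # Approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
        c = math.sqrt(2.0 / math.pi)
        return [
            0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))
            for x in values
        ]
    raise ValueError(f"unsupported mode: {mode!r}")


src = [-3.0, -1.0, 0.0, 0.5, 1.0, 3.0]
exact = gelu(src)                      # default mode is gelu_erf
approx = gelu(src, mode="gelu_tanh")

# Largest element-wise gap between the two modes on this sample
max_abs_diff = max(abs(a - b) for a, b in zip(exact, approx))

for x, a, b in zip(src, exact, approx):
    print(f"src={x:+.1f}  gelu_erf={a:+.6f}  gelu_tanh={b:+.6f}")
print(f"max abs diff: {max_abs_diff:.2e}")
```

On this sample the two modes agree closely but not exactly, so results computed with one mode should not be expected to match the other bit for bit.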

Operation attributes

| Attribute Name | Description | Value Type | Supported Values | Required or Optional |
|---|---|---|---|---|
| mode | Specifies the computation mode of GELU. | string | gelu_erf (default), gelu_tanh | Optional |

Execution arguments

When constructing an operation, the inputs and outputs must be provided in the index order listed below.

Inputs

| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | src | Required |

Outputs

| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | dst | Required |

Supported data types

The GELU operation supports the following data type combinations.

| Src | Dst |
|---|---|
| f32 | f32 |
| f16 | f16 |
| bf16 | bf16 |
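Because dst always has the same data type as src, the result is stored at the precision of that type. The sketch below gives a rough feel for what that means for a single value. It is an assumption-laden illustration, not a description of how any implementation computes internally: it evaluates the erf form in double precision and then rounds the result to each storage type. The helper names (`round_f32`, `round_f16`, `round_bf16`, `gelu_erf`) are hypothetical.

```python
import math
import struct


def gelu_erf(x):
    """Scalar GELU, erf form, evaluated in double precision."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def round_f32(x):
    # Round-trip through IEEE 754 single precision
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_f16(x):
    # Round-trip through IEEE 754 half precision ('e' format)
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_bf16(x):
    # bf16 keeps the upper 16 bits of an f32; round to nearest even.
    # Assumes x is finite (NaN and overflow are not handled here).
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


y = gelu_erf(1.0)
errors = {
    "f32": abs(round_f32(y) - y),
    "f16": abs(round_f16(y) - y),
    "bf16": abs(round_bf16(y) - y),
}

for name, err in errors.items():
    print(f"{name:>4}: rounding error {err:.3e}")
```

Under these assumptions the rounding error grows from f32 to f16 to bf16, which is worth keeping in mind when comparing outputs across data types.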