C++ API example demonstrating how one can use MatMul fused with ReLU in INT8 inference.
Concepts:
Asymmetric quantization
Zero points: dnnl::primitive_attr::set_zero_points_mask()
Create primitive once, use multiple times
Run-time tensor shapes: DNNL_RUNTIME_DIM_VAL
Weights pre-packing: use dnnl::memory::format_tag::any
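The arithmetic behind asymmetric quantization and the fused MatMul + ReLU can be sketched in plain C++ without the library. This is only a reference illustration, not the oneDNN implementation: the helper names `quantize_u8` and `matmul_relu_int8_ref` are hypothetical, and the sketch assumes u8 source data with a single zero point, s8 weights with no zero point, one scale per tensor, row-major layouts, and f32 output.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: asymmetric quantization of one float to u8,
// q = clamp(round(x / scale) + zero_point, 0, 255).
inline std::uint8_t quantize_u8(float x, float scale, std::int32_t zero_point) {
    const std::int32_t q
            = static_cast<std::int32_t>(std::lround(x / scale)) + zero_point;
    return static_cast<std::uint8_t>(std::clamp<std::int32_t>(q, 0, 255));
}

// Hypothetical reference kernel (assumption: row-major A[M x K], B[K x N]).
// Computes C = relu(a_scale * b_scale * sum_k (A[m,k] - a_zero_point) * B[k,n]).
inline std::vector<float> matmul_relu_int8_ref(
        const std::vector<std::uint8_t> &a, const std::vector<std::int8_t> &b,
        int M, int K, int N, std::int32_t a_zero_point, float a_scale,
        float b_scale) {
    assert(a.size() == static_cast<std::size_t>(M) * K);
    assert(b.size() == static_cast<std::size_t>(K) * N);

    std::vector<float> c(static_cast<std::size_t>(M) * N);
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            // Accumulate in s32 so that int8 products cannot overflow.
            std::int32_t acc = 0;
            for (int k = 0; k < K; ++k) {
                const std::int32_t a_val
                        = static_cast<std::int32_t>(a[m * K + k]) - a_zero_point;
                acc += a_val * static_cast<std::int32_t>(b[k * N + n]);
            }
            // Dequantize, then apply the fused ReLU.
            const float value = a_scale * b_scale * static_cast<float>(acc);
            c[m * N + n] = std::max(0.0f, value);
        }
    }
    return c;
}
```

In the library itself the zero point and scales would be passed through primitive attributes and the ReLU through a post-op. The loop above only shows what the fused result is expected to equal under the stated assumptions.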