C++ API example demonstrating how to use MatMul with compressed weights.
Concepts:
Asymmetric quantization
Zero points: dnnl::primitive_attr::set_zero_points()
Create primitive once, use multiple times
Weights pre-packing: use dnnl::memory::format_tag::any
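The arithmetic behind the first two concepts can be illustrated without the library. The following is a minimal sketch in plain C++, not the oneDNN example itself: it assumes per-tensor int8 weights, and the names QuantParams, choose_params, quantize, dequantize, and matmul_compressed are hypothetical helpers introduced only for illustration. It shows how a scale and a zero point map float weights to int8 and how a matrix multiplication can decompress them on the fly. In oneDNN, the zero point would instead be passed through dnnl::primitive_attr::set_zero_points() and the library would choose the weights layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper type: per-tensor asymmetric quantization parameters.
struct QuantParams {
    float scale;
    std::int32_t zero_point;
};

// Pick scale and zero point so that [lo, hi] maps onto the int8 range
// [-128, 127]. The range is widened to include 0 so that 0.0f is
// exactly representable after quantization.
QuantParams choose_params(float lo, float hi) {
    lo = std::min(lo, 0.0f);
    hi = std::max(hi, 0.0f);
    assert(hi > lo);
    const float scale = (hi - lo) / 255.0f;
    const auto zp = static_cast<std::int32_t>(std::lround(-128.0f - lo / scale));
    return {scale, zp};
}

// q = round(x / scale) + zero_point, saturated to int8.
std::int8_t quantize(float x, QuantParams p) {
    const long q = std::lround(x / p.scale) + p.zero_point;
    return static_cast<std::int8_t>(std::max(-128L, std::min(127L, q)));
}

// x ~= scale * (q - zero_point)
float dequantize(std::int8_t q, QuantParams p) {
    return p.scale * static_cast<float>(static_cast<std::int32_t>(q) - p.zero_point);
}

// C[M x N] = A[M x K] * dequantize(Wq[K x N]); all matrices are row-major.
// The int8 weights are decompressed element by element inside the loop.
std::vector<float> matmul_compressed(const std::vector<float> &a,
        const std::vector<std::int8_t> &wq, QuantParams p,
        int M, int K, int N) {
    std::vector<float> c(static_cast<std::size_t>(M) * N, 0.0f);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k)
                acc += a[m * K + k] * dequantize(wq[k * N + n], p);
            c[m * N + n] = acc;
        }
    return c;
}
```

The weights are quantized once with quantize() and the resulting int8 buffer is then reused across calls to matmul_compressed(), which mirrors the "create once, use multiple times" idea at a much smaller scale. The reference loop is written for clarity only; it makes no attempt at the blocking or pre-packing an optimized implementation would apply.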