.. index:: pair: page; Getting started on CPU with Graph API .. _doxid-graph_cpu_getting_started_cpp: Getting started on CPU with Graph API ===================================== This is an example to demonstrate how to build a simple graph and run it on CPU. This is an example to demonstrate how to build a simple graph and run it on CPU. Example code: :ref:`cpu_getting_started.cpp ` Some key take-aways included in this example: * how to build a graph and get partitions from it * how to create an engine, allocator and stream * how to compile a partition * how to execute a compiled partition Some assumptions in this example: * Only workflow is demonstrated without checking correctness * Unsupported partitions should be handled by users themselves .. _doxid-graph_cpu_getting_started_cpp_1graph_cpu_getting_started_cpp_headers: Public headers ~~~~~~~~~~~~~~ To start using oneDNN Graph, we must include the ``dnnl_graph.hpp`` header file in the application. All the C++ APIs reside in namespace ``:ref:`dnnl::graph ```. .. ref-code-block:: cpp #include #include #include #include #include #include #include "oneapi/dnnl/dnnl_graph.hpp" #include "example_utils.hpp" #include "graph_example_utils.hpp" using namespace :ref:`dnnl::graph `; using :ref:`data_type ` = :ref:`logical_tensor::data_type `; using :ref:`layout_type ` = :ref:`logical_tensor::layout_type `; using dim = :ref:`logical_tensor::dim `; using dims = :ref:`logical_tensor::dims `; .. _doxid-graph_cpu_getting_started_cpp_1graph_cpu_getting_started_cpp_tutorial: cpu_getting_started_tutorial() function ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. _doxid-graph_cpu_getting_started_cpp_1graph_cpu_getting_started_cpp_get_partition: Build Graph and Get Partitions ------------------------------ In this section, we are trying to build a graph containing the pattern ``conv0->relu0->conv1->relu1``. After that, we can get all of partitions which are determined by backend. To build a graph, the connection relationship of different ops must be known. In oneDNN Graph, :ref:`dnnl::graph::logical_tensor ` is used to express such relationship. So, next step is to create logical tensors for these ops including inputs and outputs. .. note:: It's not necessary to provide concrete shape/layout information at graph partitioning stage. Users can provide these information till compilation stage. Create input/output :ref:`dnnl::graph::logical_tensor ` for first ``Convolution`` op. .. ref-code-block:: cpp logical_tensor conv0_src_desc {0, data_type::f32}; logical_tensor conv0_weight_desc {1, data_type::f32}; logical_tensor conv0_dst_desc {2, data_type::f32}; Create first ``Convolution`` op (:ref:`dnnl::graph::op `) and attaches attributes to it, such as ``strides``, ``pads_begin``, ``pads_end``, ``data_format``, etc. .. ref-code-block:: cpp op conv0(0, op::kind::Convolution, {conv0_src_desc, conv0_weight_desc}, {conv0_dst_desc}, "conv0"); conv0.set_attr(op::attr::strides, {4, 4}); conv0.set_attr(op::attr::pads_begin, {0, 0}); conv0.set_attr(op::attr::pads_end, {0, 0}); conv0.set_attr(op::attr::dilations, {1, 1}); conv0.set_attr(op::attr::groups, 1); conv0.set_attr(op::attr::data_format, "NCX"); conv0.set_attr(op::attr::weights_format, "OIX"); Create input/output logical tensors for first ``BiasAdd`` op and create the first ``BiasAdd`` op .. ref-code-block:: cpp logical_tensor conv0_bias_desc {3, data_type::f32}; logical_tensor conv0_bias_add_dst_desc { 4, data_type::f32, layout_type::undef}; op conv0_bias_add(1, op::kind::BiasAdd, {conv0_dst_desc, conv0_bias_desc}, {conv0_bias_add_dst_desc}, "conv0_bias_add"); conv0_bias_add.set_attr(op::attr::data_format, "NCX"); Create output logical tensors for first ``Relu`` op and create the op. .. ref-code-block:: cpp logical_tensor relu0_dst_desc {5, data_type::f32}; op relu0(2, op::kind::ReLU, {conv0_bias_add_dst_desc}, {relu0_dst_desc}, "relu0"); Create input/output logical tensors for second ``Convolution`` op and create the second ``Convolution`` op. .. ref-code-block:: cpp logical_tensor conv1_weight_desc {6, data_type::f32}; logical_tensor conv1_dst_desc {7, data_type::f32}; op conv1(3, op::kind::Convolution, {relu0_dst_desc, conv1_weight_desc}, {conv1_dst_desc}, "conv1"); conv1.set_attr(op::attr::strides, {1, 1}); conv1.set_attr(op::attr::pads_begin, {0, 0}); conv1.set_attr(op::attr::pads_end, {0, 0}); conv1.set_attr(op::attr::dilations, {1, 1}); conv1.set_attr(op::attr::groups, 1); conv1.set_attr(op::attr::data_format, "NCX"); conv1.set_attr(op::attr::weights_format, "OIX"); Create input/output logical tensors for second ``BiasAdd`` op and create the op. .. ref-code-block:: cpp logical_tensor conv1_bias_desc {8, data_type::f32}; logical_tensor conv1_bias_add_dst_desc {9, data_type::f32}; op conv1_bias_add(4, op::kind::BiasAdd, {conv1_dst_desc, conv1_bias_desc}, {conv1_bias_add_dst_desc}, "conv1_bias_add"); conv1_bias_add.set_attr(op::attr::data_format, "NCX"); Create output logical tensors for second ``Relu`` op and create the op .. ref-code-block:: cpp logical_tensor relu1_dst_desc {10, data_type::f32}; op relu1(5, op::kind::ReLU, {conv1_bias_add_dst_desc}, {relu1_dst_desc}, "relu1"); Finally, those created ops will be added into the graph. The graph inside will maintain a list to store all these ops. To create a graph, :ref:`dnnl::engine::kind ` is needed because the returned partitions maybe vary on different devices. For this example, we use CPU engine. .. note:: The order of adding op doesn't matter. The connection will be obtained through logical tensors. Create graph and add ops to the graph .. ref-code-block:: cpp graph g(:ref:`dnnl::engine::kind::cpu `); g.add_op(conv0); g.add_op(conv0_bias_add); g.add_op(relu0); g.add_op(conv1); g.add_op(conv1_bias_add); g.add_op(relu1); After adding all ops into the graph, call :ref:`dnnl::graph::graph::get_partitions() ` to indicate that the graph building is over and is ready for partitioning. Adding new ops into a finalized graph or partitioning a unfinalized graph will both lead to a failure. .. ref-code-block:: cpp g.finalize(); After finished above operations, we can get partitions by calling :ref:`dnnl::graph::graph::get_partitions() `. In this example, the graph will be partitioned into two partitions: #. conv0 + conv0_bias_add + relu0 #. conv1 + conv1_bias_add + relu1 .. ref-code-block:: cpp auto partitions = g.get_partitions(); .. _doxid-graph_cpu_getting_started_cpp_1graph_cpu_getting_started_cpp_compile: Compile and Execute Partition ----------------------------- In the real case, users like framework should provide device information at this stage. But in this example, we just use a self-defined device to simulate the real behavior. Create a :ref:`dnnl::engine `. Also, set a user-defined :ref:`dnnl::graph::allocator ` to this engine. .. ref-code-block:: cpp allocator alloc {}; :ref:`dnnl::engine ` eng = :ref:`make_engine_with_allocator `(:ref:`dnnl::engine::kind::cpu `, 0, alloc); Create a :ref:`dnnl::stream ` on a given engine .. ref-code-block:: cpp :ref:`dnnl::stream ` strm {eng}; Compile the partition to generate compiled partition with the input and output logical tensors. .. ref-code-block:: cpp compiled_partition cp = partition.compile(inputs, outputs, eng); Execute the compiled partition on the specified stream. .. ref-code-block:: cpp cp.execute(strm, inputs_ts, outputs_ts);