class dnnl::graph::compiled_partition


Overview

A compiled partition object.

#include <dnnl_graph.hpp>

class compiled_partition: public compiled_partition_handle
{
public:
    // construction

    compiled_partition();
    compiled_partition(dnnl_graph_compiled_partition_t compiled_partition);

    // methods

    logical_tensor query_logical_tensor(size_t tid) const;
    std::vector<std::pair<size_t, size_t>> get_inplace_ports() const;
    logical_tensor get_scratchpad_logical_tensor() const;

    void execute(
        stream& astream,
        const std::vector<tensor>& inputs,
        const std::vector<tensor>& outputs,
        const tensor& scratchpad = tensor()
        ) const;
};

Detailed Documentation

A compiled partition object.

Construction

compiled_partition()

Default constructor. Constructs an empty object.

compiled_partition(dnnl_graph_compiled_partition_t compiled_partition)

Constructs a compiled partition object from the C API handle compiled_partition.

Methods

logical_tensor query_logical_tensor(size_t tid) const

Queries an input or output logical tensor by its tensor ID. If the ID does not belong to any input or output of the compiled partition, the API throws an exception.

Parameters:

tid

The unique ID of the required tensor.

Returns:

The logical tensor.

std::vector<std::pair<size_t, size_t>> get_inplace_ports() const

Returns the in-place hints of a compiled partition. Each pair indicates that an input and an output of the partition can share the same memory buffer for computation. In-place computation reduces the memory footprint and improves cache locality. However, the library may not have a global view of the user’s application, so the input tensor may still be used elsewhere in the user’s computation graph. In that case, the user should treat the pair as a hint only and pass a different memory buffer for the output tensor. Otherwise the input memory buffer is overwritten, which will likely produce incorrect results.

Returns:

A list of pairs of input and output IDs.
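
Since the pairs are only hints, the application has to decide which of them to honor. The sketch below shows one way to filter them, assuming the application can list the input IDs that are still read after the partition runs. Both select_safe_inplace_pairs and inputs_used_later are hypothetical names introduced for illustration; they are not part of the library.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

using inplace_pair = std::pair<std::size_t, std::size_t>; // (input ID, output ID)

// Hypothetical helper (not a oneDNN API): keeps only the in-place pairs whose
// input tensor is not needed again by the rest of the application.
// `inplace_pairs` stands for the result of get_inplace_ports();
// `inputs_used_later` is application knowledge the library does not have.
std::vector<inplace_pair> select_safe_inplace_pairs(
        const std::vector<inplace_pair> &inplace_pairs,
        const std::unordered_set<std::size_t> &inputs_used_later) {
    std::vector<inplace_pair> safe;
    for (const auto &p : inplace_pairs) {
        // Sharing a buffer would overwrite an input that is read later,
        // so such pairs are dropped and need a separate output buffer.
        if (inputs_used_later.count(p.first) == 0) safe.push_back(p);
    }
    return safe;
}
```

For each pair that survives the filter, the output tensor can be created on the same buffer as the corresponding input; every other output gets its own buffer.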

logical_tensor get_scratchpad_logical_tensor() const

Returns the logical tensor that describes the scratchpad buffer required for execution. The logical tensor has data type u8 and a strided layout, with a single dimension equal to the scratchpad size in bytes.

Returns:

A logical tensor describing the scratchpad.
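
Because the scratchpad is described as a one-dimensional u8 tensor, its single dimension can be read directly as a byte count. The following sketch shows how an application might turn that dimension into an allocation. The helpers scratchpad_size_in_bytes and allocate_scratchpad are hypothetical, and dims stands for the dimensions of the returned logical tensor. Treating a non-positive dimension as "no scratchpad needed" is an assumption, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneDNN API). `dims` stands for the dimensions of
// the logical tensor returned by get_scratchpad_logical_tensor(), which is
// documented as a single dimension holding the size in bytes (u8 elements).
std::size_t scratchpad_size_in_bytes(const std::vector<std::int64_t> &dims) {
    assert(dims.size() == 1 && "scratchpad logical tensor is expected to be 1D");
    // Assumption: a non-positive dimension means no scratchpad is required.
    return dims[0] > 0 ? static_cast<std::size_t>(dims[0]) : 0;
}

// Allocates a zero-initialized byte buffer of the required size. The caller
// owns the buffer and must keep it alive until execution has finished.
std::vector<std::uint8_t> allocate_scratchpad(
        const std::vector<std::int64_t> &dims) {
    return std::vector<std::uint8_t>(scratchpad_size_in_bytes(dims));
}
```

The data pointer of such a buffer could then back the scratchpad tensor passed to execute(). A buffer must not be shared between executions that run concurrently.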

void execute(
    stream& astream,
    const std::vector<tensor>& inputs,
    const std::vector<tensor>& outputs,
    const tensor& scratchpad = tensor()
    ) const

Executes a compiled partition.

Note

The user can provide a scratchpad tensor for execution. If none is provided, the library allocates an internal scratchpad buffer for the execution. For a user-provided scratchpad tensor, the required size is given by the logical tensor returned by get_scratchpad_logical_tensor(). The user is responsible for managing the memory of a user-provided scratchpad tensor, including allocation, deallocation, and thread safety.

Parameters:

astream

Stream object to run over.

inputs

A list of input tensors.

outputs

A list of output tensors.

scratchpad

User-provided scratchpad tensor.