Data storage

Data storage#

The usage of prepended namespace specifiers oneapi::mkl::dft is omitted below for conciseness.

The data storage convention observed by a descriptor object depends on whether it is a real or complex descriptor and, in case of complex descriptors, on the configuration value associated with configuration parameter config_param::COMPLEX_STORAGE.

Complex descriptors#

For a complex descriptor, the configuration parameter config_param::COMPLEX_STORAGE specifies how the entries of the complex data sequences it consumes and produces are stored. If that configuration parameter is associated with a configuration value config_value::COMPLEX_COMPLEX (default behavior), those entries are accessed and stored as std::complex<float> (resp. std::complex<double>) elements of a single data container (device-accessible USM allocation or sycl::buffer object) if the descriptor object is a single-precision (resp. double-precision) descriptor. If the configuration value config_value::REAL_REAL is used instead, the real and imaginary parts of those entries are accessed and stored as float (resp. double) elements of two separate, non-overlapping data containers (device-accessible USM allocations or sycl::buffer objects) if the descriptor object is a single-precision (resp. double-precision) descriptor.

These two behaviors are further specified and illustrated below.

config_value::COMPLEX_COMPLEX for config_param::COMPLEX_STORAGE

For complex descriptors with parameter config_param::COMPLEX_STORAGE set to config_value::COMPLEX_COMPLEX, each of forward- and backward-domain data sequences must belong to a single data container (device-accessible USM allocation or sycl::buffer object). Any relevant entry \(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) is accessed/stored from/in a data container provided at compute time at the index value expressed in eq. (1) (see the page dedicated to the configuration of data layout) of that data container, whose elementary data type is (possibly implicitly re-interpreted as) std::complex<float> (resp. std::complex<double>) for single-precision (resp. double-precision) descriptors.

The same unique data container is to be used for forward- and backward-domain data sequences for in-place transforms (for descriptor objects with configuration value config_value::INPLACE for configuration parameter config_param::PLACEMENT). Two separate data containers sharing no common elements are to be used for out-of-place transforms (for descriptor objects with configuration value config_value::NOT_INPLACE for configuration parameter config_param::PLACEMENT).

The following snippet illustrates the usage of config_value::COMPLEX_COMPLEX for configuration parameter config_param::COMPLEX_STORAGE, in the context of in-place, single-precision (fp32) calculations of \(M\) three-dimensional \(n_1 \times n_2 \times n_3\) complex transforms, using identical (default) strides and distances in forward and backward domains, with USM allocations.

namespace dft = oneapi::mkl::dft;
dft::descriptor<dft::precision::SINGLE, dft::domain::COMPLEX> desc({n1, n2, n3});
std::vector<std::int64_t> strides({0, n2*n3, n3, 1});
std::int64_t dist = n1*n2*n3;
std::complex<float> *Z = (std::complex<float> *) malloc_device(2*sizeof(float)*n1*n2*n3*M, queue);
desc.set_value(dft::config_param::FWD_STRIDES, strides);
desc.set_value(dft::config_param::BWD_STRIDES, strides);
desc.set_value(dft::config_param::FWD_DISTANCE, dist);
desc.set_value(dft::config_param::BWD_DISTANCE, dist);
desc.set_value(dft::config_param::NUMBER_OF_TRANSFORMS, M);
desc.set_value(dft::config_param::COMPLEX_STORAGE, dft::config_value::COMPLEX_COMPLEX);
desc.commit(queue);

// initialize forward-domain data such that entry {m;k1,k2,k3}
//   = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
auto ev = compute_forward(desc, Z); // complex-to-complex in-place DFT
// Upon completion of ev, in backward domain: entry {m;k1,k2,k3}
//   = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]

config_value::REAL_REAL for config_param::COMPLEX_STORAGE

For complex descriptors with parameter config_param::COMPLEX_STORAGE set to config_value::REAL_REAL, forward- and backward-domain data sequences are read/stored from/in two different, non-overlapping data containers (device-accessible USM allocations or sycl::buffer objects) encapsulating the real and imaginary parts of the relevant entries separately. The real and imaginary parts of any relevant complex entry \(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) are both stored at the index value expressed in eq. (1) (see the page dedicated to the configuration of data layout) of their respective data containers, whose elementary data type is (possibly implicitly re-interpreted as) float (resp. double) for single-precision (resp. double-precision) descriptors.

The same two data containers are to be used for real and imaginary parts of forward- and backward-domain data sequences for in-place transforms (for descriptor objects with configuration value config_value::INPLACE for configuration parameter config_param::PLACEMENT). Four separate data containers sharing no common elements are to be used for out-of-place transforms (for descriptor objects with configuration value config_value::NOT_INPLACE for configuration parameter config_param::PLACEMENT).

The following snippet illustrates the usage of config_value::REAL_REAL set for configuration parameter config_param::COMPLEX_STORAGE, in the context of in-place, single-precision (fp32) calculation of \(M\) three-dimensional \(n_1 \times n_2 \times n_3\) complex transforms, using identical (default) strides and distances in forward and backward domains, with USM allocations.

namespace dft = oneapi::mkl::dft;
dft::descriptor<dft::precision::SINGLE, dft::domain::COMPLEX> desc({n1, n2, n3});
std::vector<std::int64_t> strides({0, n2*n3, n3, 1});
std::int64_t dist = n1*n2*n3;
float *ZR = (float *) malloc_device(sizeof(float)*n1*n2*n3*M, queue); // data container for real parts
float *ZI = (float *) malloc_device(sizeof(float)*n1*n2*n3*M, queue); // data container for imaginary parts
desc.set_value(dft::config_param::FWD_STRIDES, strides);
desc.set_value(dft::config_param::BWD_STRIDES, strides);
desc.set_value(dft::config_param::FWD_DISTANCE, dist);
desc.set_value(dft::config_param::BWD_DISTANCE, dist);
desc.set_value(dft::config_param::NUMBER_OF_TRANSFORMS, M);
desc.set_value(dft::config_param::COMPLEX_STORAGE, dft::config_value::REAL_REAL);
desc.commit(queue);

// initialize forward-domain data such that the real part of entry {m;k1,k2,k3}
//   = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
// and the imaginary part of entry {m;k1,k2,k3}
//   = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
auto ev = compute_forward<decltype(desc), float>(desc, ZR, ZI); // complex-to-complex in-place DFT
// Upon completion of ev, in backward domain: the real part of entry {m;k1,k2,k3}
//   = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
// and the imaginary part of entry {m;k1,k2,k3}
//   = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]

Real descriptors#

Real descriptors observe only one type of data storage. Any relevant (real) entry \(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) of a data sequence in forward domain is accessed and stored as a float (resp. double) element of a single data container (device-accessible USM allocation or sycl::buffer object) if the descriptor object is a single-precision (resp. double-precision) descriptor. Any relevant (complex) entry \(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) of a data sequence in backward domain is accessed and stored as a std::complex<float> (resp. std::complex<double>) element of a single data container (device-accessible USM allocation or sycl::buffer object) if the descriptor object is a single-precision (resp. double-precision) descriptor.

The following snippet illustrates the usage of a real, single-precision descriptor (and the corresponding data storage) for the in-place, single-precision (fp32), calculation of \(M\) three-dimensional \(n_1 \times n_2 \times n_3\) real transforms, using default strides in forward and backward domains, with USM allocations.

namespace dft = oneapi::mkl::dft;
dft::descriptor<dft::precision::SINGLE, dft::domain::REAL> desc({n1, n2, n3});
// Note: integer divisions here below
std::vector<std::int64_t> fwd_strides({0, 2*n2*(n3/2 + 1), 2*(n3/2 + 1), 1});
std::vector<std::int64_t> bwd_strides({0,   n2*(n3/2 + 1),   (n3/2 + 1), 1});
std::int64_t fwd_dist = 2*n1*n2*(n3/2 + 1);
std::int64_t bwd_dist =   n1*n2*(n3/2 + 1);
float *data = (float *) malloc_device(sizeof(float)*fwd_dist*M, queue); // data container
desc.set_value(dft::config_param::FWD_STRIDES, fwd_strides);
desc.set_value(dft::config_param::BWD_STRIDES, bwd_strides);
desc.set_value(dft::config_param::FWD_DISTANCE, fwd_dist);
desc.set_value(dft::config_param::BWD_DISTANCE, bwd_dist);
desc.set_value(dft::config_param::NUMBER_OF_TRANSFORMS, M);
desc.commit(queue);

// initialize forward-domain data such that real entry {m;k1,k2,k3}
//   = data[ fwd_strides[0] + k1*fwd_strides[1] + k2*fwd_strides[2] + k3*fwd_strides[3] + m*fwd_dist ]
auto ev = compute_forward(desc, data); // real-to-complex in-place DFT
// In backward domain, the implicitly-assumed type is complex so, consider
//   std::complex<float>* complex_data = static_cast<std::complex<float>*>(data);
// upon completion of ev, the backward-domain entry {m;k1,k2,k3} is
//   = complex_data[ bwd_strides[0] + k1*bwd_strides[1] + k2*bwd_strides[2] + k3*bwd_strides[3] + m*bwd_dist ]
//   for 0 <= k3 <= n3/2.
//   Note: if n3/2 < k3 < n3, entry {m;k1,k2,k3} is not stored explicitly
//   since it is equal to std::conj(entry {m;n1-k1,n2-k2,n3-k3})

Parent topic DFT-related scoped enumeration types