Xilinx has released some sample programs for working with its DMA PCIe IP core, which are available here. Among these programs are some that perform high-speed C2H (card-to-host) transfers, or vice versa (H2C).
Do these programs already use DMA on the host side, or does something additional have to be written into the code? For instance, the code snippet that performs the writing and reading in AXI-ST mode is as follows:
void do_transfers_in_parallel(unsigned index, device_file& h2c, device_file& c2h,
                              std::array<uint32_t, array_size>& h2c_data,
                              std::array<uint32_t, array_size>& c2h_data) {
    // Host-to-card: despite its name, read_thread calls device_file::write on the H2C channel.
    std::cout << " Initiating H2C_" << index << " transfer of " << h2c_data.size() * sizeof(uint32_t) << " bytes...\n";
    std::thread read_thread(&device_file::write, &h2c, (void*)h2c_data.data(),
                            h2c_data.size() * sizeof(uint32_t));
    // Card-to-host: write_thread calls device_file::read on the C2H channel.
    std::cout << " Initiating C2H_" << index << " transfer of " << c2h_data.size() * sizeof(uint32_t) << " bytes...\n";
    std::thread write_thread(&device_file::read, &c2h, (void*)c2h_data.data(),
                             c2h_data.size() * sizeof(uint32_t));
    // Wait for both transfers to finish before returning.
    write_thread.join();
    read_thread.join();
}
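To isolate what the host code does on its own, the same two-thread pattern can be sketched without any hardware. In the sketch below, `loopback_channel` and `loopback_transfer_matches` are hypothetical names that do not appear in the Xilinx sources; the class is an in-memory stand-in for `device_file`, so no DMA takes place here. It only illustrates that the host side issues one blocking write and one blocking read on separate threads; whether a real transfer uses DMA would depend on the driver behind the device file, which this sketch does not model.

```cpp
#include <array>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for device_file: an in-memory loopback buffer,
// not a real device node. Nothing here touches PCIe or DMA.
class loopback_channel {
public:
    // Blocking write: append the caller's bytes to the internal store.
    std::size_t write(void* buffer, std::size_t size) {
        assert(buffer != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            const auto* bytes = static_cast<const std::uint8_t*>(buffer);
            store_.insert(store_.end(), bytes, bytes + size);
        }
        ready_.notify_all();
        return size;
    }

    // Blocking read: wait until `size` bytes are available, then copy them out.
    std::size_t read(void* buffer, std::size_t size) {
        assert(buffer != nullptr);
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [&] { return store_.size() >= size; });
        std::memcpy(buffer, store_.data(), size);
        store_.erase(store_.begin(),
                     store_.begin() + static_cast<std::ptrdiff_t>(size));
        return size;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::vector<std::uint8_t> store_;
};

constexpr std::size_t array_size = 1024;  // assumed value, for illustration only

// Runs one write and one read concurrently, as in the snippet above,
// and reports whether the data read back equals the data written.
bool loopback_transfer_matches() {
    loopback_channel channel;
    std::array<std::uint32_t, array_size> h2c_data{};
    std::array<std::uint32_t, array_size> c2h_data{};
    for (std::size_t i = 0; i < array_size; ++i) {
        h2c_data[i] = static_cast<std::uint32_t>(i);
    }

    std::thread h2c_thread(&loopback_channel::write, &channel,
                           (void*)h2c_data.data(),
                           h2c_data.size() * sizeof(std::uint32_t));
    std::thread c2h_thread(&loopback_channel::read, &channel,
                           (void*)c2h_data.data(),
                           c2h_data.size() * sizeof(std::uint32_t));
    c2h_thread.join();
    h2c_thread.join();
    return h2c_data == c2h_data;
}
```

Calling `loopback_transfer_matches()` from a `main` function (linked with thread support, for example `-pthread` on GCC or Clang) exercises the pattern end to end. The threads are named after the channel they serve rather than after the call they make, which avoids the read/write naming mix-up in the original snippet.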