3
votes

I have a post-processing pipeline that uses a compute shader to process a texture and write the result to a RWByteAddressBuffer.

The content of the RWByteAddressBuffer is then sent to an FPGA device via direct memory access (AMD DirectGMA technology). In other words, I instruct an external device to read the physical bytes of this buffer without the Direct3D API knowing about it.

Here is the essence of the code:

_context->CSSetShaderResources(0, 1, _nonMsaaSrv.GetAddressOf());
_context->CSSetUnorderedAccessViews(0, 1, _unorderedAccessView.GetAddressOf(), nullptr);
_context->CSSetShader(_converter.Get(), nullptr, 0);
_context->Dispatch(1920, 1200, 1);

// ... wait for direct3d compute shader to finish processing?
// send the bytes to the fpga:
_dmaController->StartDMA(_d3dBufferPhysicalAddress, fpgaLogicalAddress);

Everything works, but I could not find a way to block the thread, or to get an event, that indicates the compute shader has completed its work on the GPU.

This question suggests a solution that uses ID3D11Query to do some kind of polling, but it is my understanding that this is simply a busy wait. I was hoping to find a better solution that allows the thread to block while waiting for some kind of event. With APIs such as CUDA or OpenCL this is pretty trivial.

So is it possible to do a blocking wait for a compute shader in Direct3D 11? If so, how?

2 Answers

4
votes

If there is no need to support Windows 7 / 8, this can be achieved with the updated interfaces ID3D11Device5, ID3D11DeviceContext4 and ID3D11Fence, which are available on Windows 10 version 1703 and later.

Creating the fence object:

HR(_d3dDevice->CreateFence(0, D3D11_FENCE_FLAG_NONE, __uuidof(ID3D11Fence), reinterpret_cast<void**>(_syncFence.GetAddressOf())));

In the processing loop, we dispatch the compute shader and enqueue a signal with an incremented counter value right after it:

++_syncCounter;
_context->Dispatch(1920, 1200, 1);
HR(_context->Signal(_syncFence.Get(), _syncCounter));
HR(_syncFence->SetEventOnCompletion(_syncCounter, _syncEvent.get()));

// wait for the event (could be on a different thread)
_syncEvent.wait(); // WaitForSingleObject

Examples (for Direct3D 12, though) can be found here.

0
votes

ID3D11Query is the mechanism you're looking for; there isn't anything event-based in Direct3D 11. It is a polling mechanism, but not the same as a normal busy wait on the CPU.

You can always profile it to see what load it adds, especially if you add a delay between GetData calls on the query and try various intervals (10 ms, 100 ms, etc.) to see whether your performance improves.
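
As a rough illustration of polling with a delay instead of spinning, here is a minimal sketch in portable C++. The helper name PollUntilDone and its parameters are hypothetical and not part of Direct3D; the completion check is passed in as a callable, so the Direct3D side is only indicated in the trailing comment, which assumes a query created with D3D11_QUERY_EVENT.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a Direct3D API): calls `isDone` until it returns
// true or `timeout` elapses, sleeping `interval` between checks so the thread
// yields the CPU instead of spinning.
// Returns true if `isDone` reported completion, false on timeout.
inline bool PollUntilDone(const std::function<bool()>& isDone,
                          std::chrono::milliseconds interval,
                          std::chrono::milliseconds timeout)
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (isDone()) {
            return true;
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(interval);
    }
}

// Possible use with Direct3D 11 (Windows only, shown as a comment so the
// block stays portable). Assumes `query` is an ID3D11Query created with
// D3D11_QUERY_EVENT and that _context->End(query) was called after Dispatch:
//
//   bool finished = PollUntilDone(
//       [&] {
//           BOOL done = FALSE;
//           return _context->GetData(query, &done, sizeof(done), 0) == S_OK && done;
//       },
//       std::chrono::milliseconds(1), std::chrono::milliseconds(1000));
```

The interval trades latency against CPU load: a shorter interval notices completion sooner but wakes the thread more often, which is what the profiling suggested above would measure.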