So I have a compute shader kernel with the following logic:
[numthreads(64,1,1)]
void CVProjectOX(uint3 t : SV_DispatchThreadID){
    if(t.x >= TotalN)
        return;
    uint compt = DbMap[t.x];
    ....
I understand that branching (if/else) is generally considered not ideal in compute shaders. If so, what is the best way to limit thread work when the total number of threads needed isn't an exact multiple of the kernel's numthreads?
For instance, in my example the thread group has 64 threads. Say I need 961 threads in total (it could be anything really). If I dispatch 15 groups (960 threads), 1 db slot won't be processed; if I dispatch 16 groups (1024 threads), 63 threads will do unnecessary work or, worse, index db slots that don't exist. (The number of db slots will vary.)
Is if(t.x >= TotalN) return; fine and the right approach here? Should I clamp instead, tx = min(t.x, TotalN - 1), and have the extra threads keep rewriting the final db slot? Or should I use modulo, tx = t.x % TotalN, and have them rewrite the first db slots?
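To make the three candidates concrete, this is roughly what I mean. It's only a sketch: the cbuffer layout, the buffer types, the Output buffer and the _Guard/_Clamp/_Wrap entry-point names are made up for illustration, and the real kernel body is elided above. In all three cases I'm assuming the host dispatches (TotalN + 63) / 64 groups, i.e. 16 groups for 961 slots.

```hlsl
// Assumed declarations; the real resource types and bindings may differ.
cbuffer Params : register(b0)
{
    uint TotalN; // number of valid db slots, e.g. 961
};
StructuredBuffer<uint>   DbMap  : register(t0);
RWStructuredBuffer<uint> Output : register(u0); // hypothetical output buffer

// Option A: surplus threads return immediately.
[numthreads(64, 1, 1)]
void CVProjectOX_Guard(uint3 t : SV_DispatchThreadID)
{
    if (t.x >= TotalN)
        return;
    uint compt = DbMap[t.x];
    Output[t.x] = compt; // stand-in for the real work
}

// Option B: surplus threads are clamped onto the last valid slot.
// The last valid index is TotalN - 1, so this assumes TotalN > 0.
[numthreads(64, 1, 1)]
void CVProjectOX_Clamp(uint3 t : SV_DispatchThreadID)
{
    uint tx = min(t.x, TotalN - 1);
    uint compt = DbMap[tx];
    Output[tx] = compt; // surplus threads redo the last slot's work
}

// Option C: surplus threads wrap around onto the first slots.
// Also assumes TotalN > 0 (modulo by zero otherwise).
[numthreads(64, 1, 1)]
void CVProjectOX_Wrap(uint3 t : SV_DispatchThreadID)
{
    uint tx = t.x % TotalN;
    uint compt = DbMap[tx];
    Output[tx] = compt; // surplus threads redo the first slots' work
}
```

As far as I can tell, options B and C only make sense if every thread that lands on a given slot writes exactly the same value there; anything that accumulates or depends on t.x itself would break.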
What other solutions are there?