I'm trying to run a basic cellular automaton in a compute shader (DirectCompute) without double buffering, so I'm using an unordered access view to a RWTexture2D<uint> for the data. However, I'm getting a really strange hang/crash. I managed to reduce it to a very small snippet that reproduces the issue:
int w = 256;
for (int x = 0; x < w; ++x)
{
    for (int y = 1; y < w; ++y)
    {
        if (map[int2(x, y - 1)])
        {
            map[int2(x, y)] = 10;
            map[int2(x, y - 1)] = 30;
        }
    }
}
where map is a RWTexture2D<uint>.
If I remove the if or one of the assignments, it works. I thought it could be some kind of limit, so I tried looping over just 1/4 of the texture, but the problem persists. The code is dispatched with (1,1,1) and the kernel's numthreads is (1,1,1) too. In my real-world scenario I want to loop from bottom to top and fill the voids (0) with the pixel I'm currently visiting (think of a "falling sand" kind of effect), so it can't be parallel except across columns, since each pixel depends on the one below it.
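To make the column-parallel idea concrete, here is a minimal sketch of one falling-sand pass with one thread per column. The names Params, gWidth, gHeight and CSMain are hypothetical, and the sketch assumes that y grows downward (so row gHeight - 1 is the bottom), that gHeight is at least 1, and that the texture format supports typed UAV loads (for example R32_UINT).

```hlsl
// Sketch only: Params, gWidth, gHeight and CSMain are hypothetical names.
RWTexture2D<uint> map : register(u0);

cbuffer Params : register(b0)
{
    uint gWidth;   // texture width in pixels, assumed to be set by the application
    uint gHeight;  // texture height in pixels, assumed to be >= 1
};

// One thread per column; columns are independent, rows within a column are not.
[numthreads(64, 1, 1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    uint x = id.x;
    if (x >= gWidth)
        return; // guard threads that fall past the last column

    // Walk from the bottom row upward (assumes y grows downward).
    [loop]
    for (uint y = gHeight - 1; y > 0; --y)
    {
        uint below = map[uint2(x, y)];
        uint above = map[uint2(x, y - 1)];

        // If the current cell is a void (0) and the cell above is not,
        // move the cell above down by one row.
        if (below == 0 && above != 0)
        {
            map[uint2(x, y)] = above;
            map[uint2(x, y - 1)] = 0;
        }
    }
}
```

With this layout the dispatch would be (ceil(gWidth / 64), 1, 1) instead of (1,1,1). Each pass moves every falling pixel down by at most one row, so it would need to run once per simulation step.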
I don't understand what is causing the shader to hang, though. There's no error or anything; it simply hangs and never even times out.
EDIT:
After some further investigation, I came across something really intriguing: when I pass that w value in a constant buffer, it all works fine. I have no idea what would cause that. Maybe it's some compiler optimization gone wrong; maybe the compiler tries to unroll the loop because the bound is a compile-time constant, which causes the issue, and passing the value in a constant buffer disables that. However, I'm compiling the shaders in debug with no optimization, so I don't know.
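For reference, this is roughly what the variant that works looks like, with w coming from a constant buffer. The [loop] attribute is an extra I have not confirmed as a fix: it is the standard HLSL hint asking the compiler to keep real flow control instead of unrolling, so it might be a way to test the unrolling guess even with a literal bound. Params and CSMain are hypothetical names.

```hlsl
// Sketch only: Params and CSMain are hypothetical names.
RWTexture2D<uint> map : register(u0);

cbuffer Params : register(b0)
{
    int w; // loop bound supplied at runtime, so it is not a compile-time constant
};

[numthreads(1, 1, 1)]
void CSMain()
{
    // [loop] requests flow control rather than unrolling (untested as a fix here).
    [loop]
    for (int x = 0; x < w; ++x)
    {
        [loop]
        for (int y = 1; y < w; ++y)
        {
            if (map[int2(x, y - 1)] != 0)
            {
                map[int2(x, y)] = 10;
                map[int2(x, y - 1)] = 30;
            }
        }
    }
}
```

If the hang disappears with [loop] and a literal 256 bound, that would point at the compiler trying to unroll 256 x 255 iterations of UAV reads and writes rather than at the GPU itself.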