I'm writing a compute shader (in the Unity environment, which uses DirectX 11 DirectCompute) that needs to do a very simple task: check whether any pixel in the image has green == 1 and blue > 0.5. (For clarity - the green channel is a 2D "light" and the blue channel is that light's "trail" - I want to detect whenever the light crosses back over its trail.)

I'm as far as displaying the overlap (shown in white) for debugging purposes, but I have no idea how to do something as simple as return a value indicating whether the texture actually contains an overlap. My confusion stems from how threads work. I have a float buffer with room for a single float - I simply need a 1 or 0.

For clarification the following two images show a "before" and "after" - all I need is a single "true" value telling me that some white exists in the second image.

[Images: the original image; the "overlap"]

The compute shader is as follows:

#pragma kernel CSMain

Texture2D<float4> InputTexture;
RWTexture2D<float4> OutputTexture;
RWStructuredBuffer<float> FloatBuffer;

[numthreads(8,8,1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    // need to detect any region where g == 1 and blue > 0.5

    float green = InputTexture[id.xy].g;
    float blue = round(InputTexture[id.xy].b);

    float overlap = round((green + blue) / 2.0);

    OutputTexture[id.xy] = float4(overlap, overlap, overlap, 1);

    // answer here?? Note that the output texture is only for debugging purposes
    FloatBuffer[0] = ??
}

1 Answer

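You can use an atomic operation to count the matching pixels. Run your compute shader with one thread per pixel and, if the pixel meets the criteria, increment a counter in your RW buffer. After the dispatch, a non-zero count means the texture contains an overlap; the counter has to be reset to 0 before each dispatch.

Something like this: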

#pragma kernel CSMain

Texture2D<float4> InputTexture;
// uint rather than float: InterlockedAdd only works on integer types
RWBuffer<uint> NotAFloatBuffer;

[numthreads(8,8,1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    // need to detect any region where g == 1 and blue > 0.5
    float green = InputTexture[id.xy].g;
    float blue = round(InputTexture[id.xy].b);

    float overlap = round((green + blue) / 2.0);

    // every matching pixel bumps the shared counter atomically
    if (overlap > 0)
        InterlockedAdd(NotAFloatBuffer[0], 1);
}

In your case, you can stop here. Atomics do carry a small cost, and a common optimization is to reduce within each thread group first and then issue the atomic from a single thread per group, but that only matters in the most extreme cases, so you do not have to worry about it.
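
For reference, the per-group reduction mentioned above could look roughly like the sketch below. It is untested and only one possible arrangement: the groupshared variable name groupCount is made up for illustration, and it assumes NotAFloatBuffer[0] has been cleared to 0 before the dispatch. Each 8x8 group counts its matches in group-shared memory, then a single thread adds the group total to the global counter.

```hlsl
#pragma kernel CSMain

Texture2D<float4> InputTexture;
RWBuffer<uint> NotAFloatBuffer;

// Hypothetical name: one counter shared by the 64 threads of a group.
groupshared uint groupCount;

[numthreads(8,8,1)]
void CSMain(uint3 id : SV_DispatchThreadID, uint groupIndex : SV_GroupIndex)
{
    // The first thread of the group clears the shared counter.
    if (groupIndex == 0)
        groupCount = 0;
    GroupMemoryBarrierWithGroupSync();

    // Same overlap test as in the kernel above.
    float green = InputTexture[id.xy].g;
    float blue = round(InputTexture[id.xy].b);
    float overlap = round((green + blue) / 2.0);

    // Count matches in group-shared memory instead of the global buffer.
    if (overlap > 0)
        InterlockedAdd(groupCount, 1);
    GroupMemoryBarrierWithGroupSync();

    // At most one global atomic per group, and none if nothing matched.
    if (groupIndex == 0 && groupCount > 0)
        InterlockedAdd(NotAFloatBuffer[0], groupCount);
}
```

The result is read the same way as before: a non-zero value in NotAFloatBuffer[0] means at least one pixel overlapped.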