
The scenario is: I have a texture A. There are 3 loops. Each loop writes a channel into the texture A. After the 3 loops, all 3 channels in A are updated.

The shader looks like this:

vec3 tmpVec3 = texture(inputTexture0, vUV).rgb; // inputTexture0 is texture A
tmpVec3[channelIndex_P] = texture(inputTexture1, vUV).r; // write one channel from inputTexture1

color = vec4(tmpVec3, 1.0);

It renders to texture A, that is, inputTexture0. By sampling from and rendering to the same texture, I save the memory of one texture.

However, the result is not what I expect.

I read the article "Sampling and Rendering to the Same Texture". It says:

"Meaning it may do what you want, the sampler may get old data, the sampler may get half old and half new data, or it may get garbage data. Any of these are possible outcomes."

But since each pixel is written only after that same pixel has been read, why can this go wrong?

Another article, "Sampling from and rendering to the same texture and parallel sorting / hashing", says:

"But if the result in every framebuffer pixel would be the value of one of the fragments which wrote to that pixel and not a combination of values from multiple fragments, this functionality would still be useful for parallel sorting / hashing algorithms implemented in glsl."

I do not understand this passage. Does it say that in some situations sampling from and rendering to the same texture can be used with a defined result? If so, how? And how can I solve my current situation, given that I want to save one texture?

What version of OpenGL are you targeting? GL4 has image load/store that will help you with this, but you'll have to implement synchronization yourself. - Andon M. Coleman
OpenGL 3.2. My confusion is this: each fragment only writes the pixel that it has just read. So why doesn't it work? And what does the second link imply about situations in which it can work? - user1914692
You can do that using texture barriers on NV hardware. I don't know if other vendors ever adopted that extension, considering it basically became obsolete when compute shaders and image load/store were added. - Andon M. Coleman
Thanks. So would you please explain why my approach cannot work, since the read and the write I care about do not happen at the same time? - user1914692

1 Answer

"So would you please explain why my approach cannot work, since the read and the write I care about do not happen at the same time?"

They absolutely are processed at the same time. I think you are misunderstanding how shaders are scheduled on the GPU. Fragment shaders do not run serially, and your write to the texture will not necessarily complete before a parallel invocation reads the same texture. Nor is there any guarantee that the texture memory is not cached, in which case changes to that memory may be invisible to later reads.

This is why you need barriers. A texture barrier (glTextureBarrierNV, from the NV-specific NV_texture_barrier extension) or a general-purpose memory barrier (glMemoryBarrier, used with image load/store) is issued between draw calls. It makes the GPU finish the outstanding reads and writes to the texture and discard stale cached copies before any later shader invocation accesses it. This prevents the data hazards that would otherwise result in reading the wrong data.
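As a rough illustration of the texture-barrier route, here is a minimal sketch of the question's shader adapted for GL 3.2 (GLSL 1.50). It assumes the driver exposes NV_texture_barrier, that each pass draws a single non-overlapping full-screen quad, and that the application calls glTextureBarrierNV() between the three passes. As I read the extension specification, the result is defined only when each texel is read and written once, by the fragment that covers it, so texture A is fetched with texelFetch at the fragment's own coordinate rather than with a filtered lookup. The declarations are guesses at what the question's shader uses.

```glsl
#version 150 core

// Assumed declarations; the question does not show them.
uniform sampler2D inputTexture0;  // texture A, also attached as the render target
uniform sampler2D inputTexture1;  // source of the channel written in this pass
uniform int channelIndex_P;       // 0, 1 or 2: which channel of A this pass updates

in vec2 vUV;
out vec4 color;

void main() {
    // Read exactly the texel this fragment is about to overwrite, unfiltered.
    ivec2 texel = ivec2(gl_FragCoord.xy);
    vec3 tmpVec3 = texelFetch(inputTexture0, texel, 0).rgb;

    // Replace one channel with the red channel of inputTexture1.
    tmpVec3[channelIndex_P] = texture(inputTexture1, vUV).r;

    color = vec4(tmpVec3, 1.0);
}
```

Without the barrier between passes, a later pass may still sample values that an earlier pass has not finished writing, which is the hazard described above.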

To do this on a GL3-class implementation, texture barriers may be your only option. In GL 4.2+, image load/store would be the preferred approach, and some older implementations support it through the GL_ARB_shader_image_load_store extension.
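For comparison, the image load/store route might look like the following sketch for GLSL 4.20. It assumes texture A has an RGBA8 internal format and is bound to image unit 0 with glBindImageTexture for read-write access, that the viewport matches the size of A, that color writes to the framebuffer are not needed, and that the application calls glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT) between the three passes. The name imageA is hypothetical.

```glsl
#version 420 core

// Assumed binding: texture A as a read-write RGBA8 image on unit 0.
layout(binding = 0, rgba8) uniform image2D imageA;

uniform sampler2D inputTexture1;  // source of the channel written in this pass
uniform int channelIndex_P;       // 0, 1 or 2: which channel of A this pass updates

in vec2 vUV;

void main() {
    // Each invocation touches only its own texel, so invocations within
    // one pass never read or write each other's data.
    ivec2 texel = ivec2(gl_FragCoord.xy);
    vec4 value = imageLoad(imageA, texel);

    // Replace one channel with the red channel of inputTexture1.
    value[channelIndex_P] = texture(inputTexture1, vUV).r;

    imageStore(imageA, texel, value);
}
```

Here the shader writes through imageStore instead of a fragment output, so texture A does not need to be attached to the framebuffer at all.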

Without any of those things, however, what you are trying to do right now invokes undefined behavior. You need some sort of synchronization construct to do this correctly, and that is where barriers come in.