
I have implemented a simple 2D game using LWJGL (OpenGL) in which objects fade out as they get further away from the player. The fading was initially implemented by computing the distance from the player to each object's origin and using it to scale the object's alpha/opacity.

However, with larger objects this approach looks too coarse. My solution was to scale alpha/opacity for every pixel of the object instead. Not only would this look better, it would also move computation from the CPU to the GPU.

I figured I could implement it using an FBO and a temporary texture.
By drawing to the FBO and masking it with a precomputed distance map (a texture) using a special blend mode, I intended to achieve the effect. The algorithm is as follows:

0) Initialize OpenGL and set up the FBO
1) Render background to standard buffer
2) Switch to custom FBO and clear it
3) Render objects (to FBO)
4) Mask FBO using distance-texture
5) Switch to standard buffer
6) Render FBO temporary texture (to standard buffer)
7) Render HUD elements

A bit of extra info:

  • The temporary texture has the same size as the window (and thus standard buffer)
  • Step 4 uses a special blend mode to achieve the desired effect:
    GL11.glBlendFunc( GL11.GL_ZERO, GL11.GL_SRC_ALPHA );
  • My temporary texture is created with min/mag filters: GL11.GL_NEAREST
  • The data is allocated using: org.lwjgl.BufferUtils.createByteBuffer(4 * width * height);
  • The texture is initialized using: GL11.glTexImage2D( GL11.GL_TEXTURE_2D, 0, GL11.GL_RGBA, width, height, 0, GL11.GL_RGBA, GL11.GL_UNSIGNED_BYTE, dataBuffer);
  • There are no GL errors in my code.
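
To make the masking in step 4 concrete, here is a minimal CPU-side sketch of what `glBlendFunc(GL_ZERO, GL_SRC_ALPHA)` computes for a single pixel, assuming the default additive blend equation. The class name, method name, and sample pixel values are made up for illustration; nothing here is taken from the actual game code.

```java
public class MaskBlend {
    /**
     * Emulates glBlendFunc(GL_ZERO, GL_SRC_ALPHA) for one RGBA pixel:
     * result = src * 0 + dst * src.alpha, applied to all four channels.
     * Assumes the default blend equation (GL_FUNC_ADD).
     */
    static float[] blendZeroSrcAlpha(float[] src, float[] dst) {
        float srcAlpha = src[3];
        float[] out = new float[4];
        for (int c = 0; c < 4; c++) {
            out[c] = src[c] * 0.0f + dst[c] * srcAlpha;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values: a distance-map texel with alpha 0.25
        // drawn over an opaque object pixel already in the FBO.
        float[] maskTexel = {1.0f, 1.0f, 1.0f, 0.25f};
        float[] fboPixel = {0.8f, 0.4f, 0.2f, 1.0f};
        float[] result = blendZeroSrcAlpha(maskTexel, fboPixel);
        // Every channel of the FBO pixel, alpha included, is scaled by 0.25.
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

The mask's own color never reaches the FBO; only its alpha matters, which is why a single-purpose distance texture is enough for this step.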

This does indeed achieve the desired result. However, when I did some performance testing I found that my FBO approach cripples performance. I tested by requesting 1000 successive renders and measuring the total time. The results were as follows:

In 512x512 resolution:

  • Normal: ~1.7s
  • FBO: ~2.5s
  • (FBO without step 6: ~1.7s)
  • (FBO without step 4: ~1.7s)

In 1680x1050 resolution:

  • Normal: ~1.7s
  • FBO: ~7s
  • (FBO without step 6: ~3.5s)
  • (FBO without step 4: ~6.0s)
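
As a quick sanity check on these numbers, the sketch below converts the totals into extra milliseconds per frame and compares the growth with the growth in pixel count. The class and method names are made up for illustration. The FBO overhead comes out at roughly 0.8 ms per frame at 512x512 and roughly 5.3 ms at 1680x1050, a ratio of about 6.6, while the pixel count grows by about 6.7. That the cost is proportional to the number of pixels touched is an inference from these figures, not something that was measured separately.

```java
public class FboOverhead {
    /** Extra milliseconds per frame, given total seconds for `frames` renders. */
    static double overheadMsPerFrame(double fboSeconds, double normalSeconds, int frames) {
        return (fboSeconds - normalSeconds) * 1000.0 / frames;
    }

    /** Ratio of pixel counts between two resolutions. */
    static double pixelRatio(int w1, int h1, int w2, int h2) {
        return ((double) w1 * h1) / ((double) w2 * h2);
    }

    public static void main(String[] args) {
        int frames = 1000;
        double small = overheadMsPerFrame(2.5, 1.7, frames); // 512x512
        double large = overheadMsPerFrame(7.0, 1.7, frames); // 1680x1050
        System.out.println("Overhead at 512x512 (ms/frame):   " + small);
        System.out.println("Overhead at 1680x1050 (ms/frame): " + large);
        System.out.println("Overhead ratio: " + (large / small));
        System.out.println("Pixel ratio:    " + pixelRatio(1680, 1050, 512, 512));
    }
}
```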

As you can see, this scales really badly. To make it even worse, I intend to add a second pass of this type. The machine I tested on is high-end relative to my target audience, so I can expect people to get far below 60 fps with this approach, which is hardly acceptable for a game this simple.

What can I do to salvage my performance?

What exactly should the effect look like? Fading out at the borders or something similar? - Gunther Piez
The distance map is currently implemented using double dist = Math.hypot(nRadius-i,nRadius-j); double a = Math.max( 0, 1 - dist / nRadius ); where a*a is used as alpha. - Scarzzurs
Where i,j iterate over all pixels in the object? I cannot answer why your FBO operation is slow, but I would implement it using multitexturing with your distance map as the second texture, and maybe additionally scale alpha if a certain distance is reached. - Gunther Piez
This is an example where the fragment shader is jumping up and down, shouting "please, please, please, use meeeeee!" :-) Although that's a serious change in the way you have to think and program, you should consider it nevertheless. Squared distance to gl_FragCoord.xy from some point that you pass as a uniform boils down to one vector subtract and one dot product. If you want linear attenuation, it's another reciprocal and multiply, but still it's as easy as can be and will be super super fast (around a dozen cycles, no extra texture, no extra blending, no obscure blend modes). - Damon
I agree with Damon. Your FBO solution is clever, but ultimately it will always be slow compared to a shader-based approach. It's best to only use FBOs for rendering to textures that don't have to change every frame. - sidewinderguy
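
For reference, the distance-map formula quoted in the comments can be written out as a small self-contained sketch. The map size of (2*nRadius+1) texels per side and the class and method names are assumptions made for illustration; the comment only gives the per-texel formula.

```java
public class DistanceMap {
    /** Alpha for texel (i, j) of a map centered on (nRadius, nRadius). */
    static double alphaAt(int i, int j, int nRadius) {
        double dist = Math.hypot(nRadius - i, nRadius - j);
        double a = Math.max(0, 1 - dist / nRadius);
        return a * a; // a*a is used as alpha, per the comment above
    }

    /** Builds the full map; the (2*nRadius+1)^2 size is an assumption. */
    static double[][] build(int nRadius) {
        int size = 2 * nRadius + 1;
        double[][] map = new double[size][size];
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                map[i][j] = alphaAt(i, j, nRadius);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        double[][] map = build(4);
        // Fully opaque at the center, fully transparent at distance nRadius or more.
        System.out.println("center: " + map[4][4] + ", edge: " + map[4][0] + ", corner: " + map[0][0]);
    }
}
```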

1 Answer


As suggested by Damon and sidewinderguy, I successfully implemented a similar solution using a fragment shader (and a vertex shader). Performance is a little better than with my initial CPU-side, per-object computation, which in turn is MUCH faster than my FBO approach. At the same time it gives visual results much closer to the FBO approach (overlapping objects behave a bit differently).

For anyone interested, the fragment shader basically transforms gl_FragCoord.xy and does a texture lookup. I am not sure this gives the best performance, but with only one other texture active I do not expect a gain from omitting the lookup and computing the value directly. Also, I no longer have a performance bottleneck, so further optimization can wait until it turns out to be needed.

Also, I am very grateful for all the help, suggestions and comments I received :-)
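
A rough sketch of what such a shader could look like follows. It is not the actual shader from the game. All uniform names (`objectTex`, `distanceMap`, `playerPos`, `mapSizePx`) are hypothetical, and it assumes GLSL 1.20, a vertex shader that writes `gl_TexCoord[0]` and `gl_FrontColor`, and a distance map that stores the fade value in its alpha channel and is clamped at the edges. The Java method mirrors the coordinate transform on the CPU so it can be checked without a GL context.

```java
public class FadeShader {
    /** Hypothetical GLSL 1.20 fragment shader source, to be passed to the GL shader compiler. */
    static final String FRAGMENT_SOURCE =
          "#version 120\n"
        + "uniform sampler2D objectTex;   // the object's own texture\n"
        + "uniform sampler2D distanceMap; // fade value stored in alpha\n"
        + "uniform vec2 playerPos;        // player position in window pixels\n"
        + "uniform float mapSizePx;       // window pixels covered by the map\n"
        + "void main() {\n"
        + "    vec4 color = texture2D(objectTex, gl_TexCoord[0].st) * gl_Color;\n"
        + "    // Transform the window position into distance-map coordinates.\n"
        + "    vec2 uv = (gl_FragCoord.xy - playerPos) / mapSizePx + 0.5;\n"
        + "    float fade = texture2D(distanceMap, uv).a;\n"
        + "    gl_FragColor = vec4(color.rgb, color.a * fade);\n"
        + "}\n";

    /** CPU mirror of the shader's transform from window pixels to map coordinates. */
    static double[] toMapUv(double fragX, double fragY,
                            double playerX, double playerY, double mapSizePx) {
        return new double[] {
            (fragX - playerX) / mapSizePx + 0.5,
            (fragY - playerY) / mapSizePx + 0.5
        };
    }

    public static void main(String[] args) {
        // A fragment at the player's position samples the center of the map.
        double[] uv = toMapUv(100, 200, 100, 200, 256);
        System.out.println(uv[0] + ", " + uv[1]);
    }
}
```

Because the fade is applied while each object is drawn, no intermediate render target or extra full-screen pass is needed, which fits the performance difference reported above.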