0
votes

I'm trying to figure out how to do pixel sorting on the GPU. As I understand from the various tutorials I've followed, a fragment shader cannot use or manipulate the pixels around the one it is currently processing. But there are blur shaders that achieve a Gaussian blur by averaging the pixels around the pixel currently being processed.

How would I implement such an algorithm? Do I need to feed the output of the shader back into itself, or into another one?

Thanks in advance

2
Do you want to use OpenGL, or general-purpose GPU computing? For this kind of task you may be better off with OpenCL or CUDA. Do you want to implement a median filter? - Unick
Doesn't really matter; I'm using openFrameworks, so anything supported there would be great :D. - polyclick

2 Answers

1
votes

But there are blur shaders that are able to do a Gaussian blur by averaging out the pixels around the currently processing pixel

You can't write to arbitrary pixels, but through a sampler (texture or image access) you can read arbitrary pixels, which is all a blur shader needs.
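To illustrate the read-many, write-one constraint, here is a minimal CPU sketch of the per-pixel logic a blur fragment shader runs (a box blur rather than a true Gaussian, for brevity; all names are illustrative, not from any library):

```python
# CPU sketch of what a blur fragment shader does per pixel:
# read neighbouring texels (never write them) and output one value.

def blur_pixel(image, x, y, radius=1):
    """Average the (2*radius+1)^2 neighbourhood around (x, y).

    `image` is a list of rows of grayscale values; out-of-range
    coordinates are clamped, mimicking GL_CLAMP_TO_EDGE sampling.
    """
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sx = min(max(x + dx, 0), w - 1)  # clamp to edge
            sy = min(max(y + dy, 0), h - 1)
            total += image[sy][sx]
            count += 1
    return total / count

def blur(image, radius=1):
    # Each output pixel is computed independently of all the others --
    # exactly the constraint a fragment shader works under.
    return [[blur_pixel(image, x, y, radius) for x in range(len(image[0]))]
            for y in range(len(image))]
```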

1
votes

GPU Gems 2 Chapter 46 describes sorting on the GPU.

Consider how you might use the Bubble-sort algorithm:

If each output pixel reads all N pixels of the column being sorted, the shader will be very slow: N reads per output pixel means N^2 accesses in total (heavily affected by caching), and that is a lot of memory traffic. But it will work, and this version may be the easiest to reason about.

If you access only 2 input pixels per output pixel, you can effectively "swap" a pair of adjacent pixels on each pass. This may require up to N passes to move a pixel N steps. (You can count swaps and break out early by using an occlusion query.)
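This two-pixels-per-pass scheme is essentially odd-even transposition sort. A minimal CPU sketch, assuming a single column of values, where one call to `transposition_pass` (a hypothetical name) corresponds to one GPU render pass:

```python
def transposition_pass(column, offset):
    """One GPU pass: every 'pixel' compares itself with one neighbour.

    `offset` alternates 0/1 between passes so that even pairs
    (0,1),(2,3),... and odd pairs (1,2),(3,4),... are both handled.
    Returns the new column and the number of swaps performed.
    """
    out = list(column)
    swaps = 0
    for i in range(offset, len(out) - 1, 2):
        if out[i] > out[i + 1]:
            out[i], out[i + 1] = out[i + 1], out[i]
            swaps += 1
    return out, swaps

def sort_column(column):
    # At most N passes are needed; counting swaps lets us stop after
    # two consecutive quiet passes (one even, one odd) -- the GPU
    # analogue is the occlusion-query early-out mentioned above.
    quiet = 0
    for p in range(len(column)):
        column, swaps = transposition_pass(column, p % 2)
        quiet = quiet + 1 if swaps == 0 else 0
        if quiet == 2:
            break
    return column
```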

By combining the two techniques you can sample k pixels at a time, allowing a pixel to move up to k steps per pass; this requires roughly N/k passes and makes more efficient use of resources.
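One way the k-pixels-per-pass idea might be structured (an illustrative sketch, not code from the answer): sort disjoint blocks of k adjacent pixels each pass, shifting the block boundary on alternate passes so values can migrate across blocks. With k = 2 this reduces to the odd-even pair scheme above.

```python
def block_sort_pass(column, k, phase):
    """One pass: each block of k adjacent 'pixels' is fully sorted.

    Shifting the block start by k//2 on alternate phases lets values
    cross block boundaries between passes.
    """
    out = list(column)
    start = (phase % 2) * (k // 2)
    if start:
        # Leading partial block when the block grid is offset.
        out[0:start] = sorted(out[0:start])
    for i in range(start, len(out), k):
        out[i:i + k] = sorted(out[i:i + k])
    return out

def sort_column_blocked(column, k):
    # Two consecutive unchanged passes (both grid offsets) imply every
    # adjacent pair lies inside some sorted block, so we are done.
    quiet, phase = 0, 0
    while quiet < 2:
        nxt = block_sort_pass(column, k, phase)
        quiet = quiet + 1 if nxt == column else 0
        column = nxt
        phase += 1
    return column
```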

But the real high-performance solutions come from implementing smarter algorithms than bubble sort. The GPU Gems chapter suggests a bitonic merge sort, and even heapsort and quicksort are possible with enough cleverness.

But there's a lot to know to craft an efficient GPU implementation--more than can be covered in a general answer here. You may want to move to a CUDA implementation rather than GLSL, but then you run into bank conflicts, warps, caching, sampling, and swizzling. There's no quick answer.

I suggest you give it a try and then ask a more specific question when you get stuck.