0
votes

I came across this thread, Turning off coalescing in Nvidia Forum, where it is stated that "So far setting a memory pointer/array to "volatile" seems to help for massively random access. (Gives 50% more performance ?!)".

I am doing finite difference computation (3D stencil computation) on a GPU (Fermi) using CUDA and want to improve the performance of the computation. Since access along the z axis of the 3D array is random (the array is laid out Z, Y, X from slowest- to fastest-varying), I feel like using volatile would be a better choice. Currently I am using shared memory:

__shared__ float 2dplane[32][32]

When I tried using volatile as

volatile float **plane = 2dplane;

I get this error:

error: a value of type "float ()[16]" cannot be used to initialize an entity of type "volatile float *"

Can anyone tell me how to use volatile on a 2D array? A code example would be helpful. Moreover, it would be great if someone could tell me how much performance gain I can expect.


1 Answer

4
votes

Your shared memory array decays to a pointer to its first row, not to a pointer-to-pointer, and it isn't declared volatile either, which is why the initialization is illegal.

As for the volatile suggestion from the NVIDIA boards, all I can say is that you shouldn't believe everything you read, because it is nonsense (Google "skybuck flying" if you dare). The volatile keyword controls whether and how the compiler keeps values in registers versus writing them back to memory. It will have no beneficial effect on memory throughput.