I have been following the tutorial at http://www.nvidia.com/docs/IO/116711/sc11-cuda-c-basics.pdf to teach myself basic GPU programming, but I still don't quite understand the topology of blocks and threads. On page 42 the code defines the data size as follows:
#define N (2048*2048)
#define THREADS_PER_BLOCK 512
Is this tutorial making assumptions about the hardware? I'm currently on a laptop with an NVIDIA 520M GPU. Using the cudaDeviceProp structure, I was able to determine that my GPU supports up to 1024 threads per block. What exactly does the 2048x2048 quantify? The number of blocks? And how do I know whether these values are correct for my device?
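To make the question concrete, here is a minimal host-side sketch of how I currently read the two defines. It assumes that N is the total number of data elements (one thread per element) rather than a block count, and that the number of blocks is derived from it. The helper names blocks_needed and block_size_ok are my own and do not come from the tutorial; the limit passed to block_size_ok would be the maxThreadsPerBlock value reported by cudaDeviceProp.

```c
#define N (2048*2048)          /* from the tutorial: assumed to be the total element count */
#define THREADS_PER_BLOCK 512  /* from the tutorial: threads in each block */

/* Hypothetical helper: number of blocks needed to cover n elements with
 * one thread per element, rounding up so a partial last block is counted. */
unsigned int blocks_needed(unsigned int n, unsigned int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

/* Hypothetical helper: returns 1 if the chosen block size fits within the
 * device limit (e.g. maxThreadsPerBlock from cudaDeviceProp), else 0. */
int block_size_ok(unsigned int threads_per_block, unsigned int max_threads_per_block)
{
    return threads_per_block > 0 && threads_per_block <= max_threads_per_block;
}
```

Under that reading, blocks_needed(N, THREADS_PER_BLOCK) gives 8192 blocks, and block_size_ok(THREADS_PER_BLOCK, 1024) passes on my GPU. I assume the block count would also have to be compared against maxGridSize from cudaDeviceProp, but I am not sure this interpretation is right in the first place.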