From CUDA Programming Guide 4.2:
[...] at every instruction issue time, a warp scheduler selects a warp
that has threads ready to execute its next instruction (the active
threads of the warp) and issues the instruction to those threads.
So, the maximum number of warps per SM that can be issued instructions concurrently is equal to the number of warp schedulers.
The GeForce GTX 580 is a compute capability 2.0 (Fermi) device:
For devices of compute capability 2.x, a multiprocessor consists of: [...] 2 warp schedulers
This means each SM of your GPU can issue instructions to 2 warps = 64 threads per cycle, which is 1024 threads across the 16 SMs of a GTX 580. Please note, however, that it's highly recommended to use far more threads than that:
The number of clock cycles it takes for a warp to be ready to execute
its next instruction is called the latency, and full utilization is
achieved when all warp schedulers always have some instruction to
issue for some warp at every clock cycle during that latency period,
or in other words, when latency is completely “hidden”.
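If you want to check these figures for your own card, the runtime API reports them via cudaGetDeviceProperties. Here's a minimal sketch (the field names are real runtime API fields; the numbers it prints of course depend on your device):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);   // query device 0

    printf("Device               : %s (compute %d.%d)\n",
           prop.name, prop.major, prop.minor);
    printf("Multiprocessors (SMs): %d\n", prop.multiProcessorCount);
    printf("Warp size            : %d threads\n", prop.warpSize);
    printf("Max threads per SM   : %d (= %d resident warps)\n",
           prop.maxThreadsPerMultiProcessor,
           prop.maxThreadsPerMultiProcessor / prop.warpSize);
    return 0;
}
```

On a compute capability 2.x device this reports 1536 resident threads (48 warps) per SM, far more than the 2 warps the schedulers can issue to in any given cycle; those extra resident warps are exactly what hides the latency described in the quote above.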
Regarding your other questions: the GeForce GTX 690 has 3072 CUDA Cores. However, to CUDA it appears as two separate GPUs with 1536 cores each, so it's not better than two GeForce GTX 680 cards, and the latter is easily overclocked judging by numerous online reviews. The largest memory among GPUs is installed in the nVidia Tesla M2090: 6 GiB of GDDR5 (512 CUDA Cores). I expect a new family of Teslas based on the Kepler architecture (like the GeForce 6xx series) to be released soon, but I haven't heard of any official announcements.
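Coming back to the dual-GPU point: the runtime simply enumerates a GTX 690 as two devices, and you target each half explicitly with cudaSetDevice. A rough sketch of what that looks like (device ordering and names vary from system to system):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);          // a GTX 690 reports 2 devices here

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("Device %d: %s, %d SMs, %.0f MiB of global memory\n",
               dev, prop.name, prop.multiProcessorCount,
               prop.totalGlobalMem / (1024.0 * 1024.0));
    }

    // Allocations and kernel launches intended for the second half of the
    // board must be issued after selecting that device explicitly:
    if (count > 1) {
        cudaSetDevice(1);
        // ... cudaMalloc / kernel launches for device 1 go here ...
    }
    return 0;
}
```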