In the context of GPUs, can someone clarify the difference in meaning between: core, thread processor, stream processor, and multiprocessor?

Also, how do I find the limit on the number of active blocks per SM for a GTX 570 GPU (compute capability 2.0)?

Further, the device properties show that the maximum number of threads per block on my GPU is 1024, but the CUDA Occupancy Calculator does not accept that value for a compute capability 2.0 GPU. Is there a newer version of the CUDA Occupancy Calculator (after 2.1)?
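(The 1024 figure comes from a device-property query along the following lines; this is just a minimal sketch, assuming the GTX 570 is device 0.)

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);   // assuming the GTX 570 is device 0
        printf("Compute capability:    %d.%d\n", prop.major, prop.minor);   // prints 2.0
        printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);     // prints 1024
        return 0;
    }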


1 Answer

  1. They relate as follows: a GPU consists of several SMs (streaming multiprocessors). The exact number of SMs depends on which GPU you are using (low-end cards have only 2 SMs, while high-end ones have up to 16). Each SM contains several cores: 8 on pre-Fermi cards (CC 1.x), 32 on CC 2.0 Fermi cards such as your GTX 570, and 48 on CC 2.1 cards. I had never heard the term "thread processor" before; after some googling, it appears to be just another word for "core", probably used in early versions of the documentation and later replaced.

  2. The max. number of blocks per SM is 8 (see the CUDA Occupancy Calculator, tab "GPU Data", row "Thread Blocks / Multiprocessor"); a small runtime sketch illustrating points 1 and 2 follows this list.

  3. The CUDA Occupancy Calculator from http://developer.nvidia.com/nvidia-gpu-computing-documentation works fine for me. Maybe you are using an old, buggy version.
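For completeness, here is a minimal sketch of points 1 and 2 as a runtime query, assuming the card of interest is device 0. Note that the cores-per-SM and blocks-per-SM figures are not reported by cudaDeviceProp, so they are filled in by hand for CC 1.x/2.x from the numbers above:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);   // assume the card of interest is device 0

        // Cores per SM are not in cudaDeviceProp; table taken from point 1 above.
        int coresPerSM = 0;
        if (prop.major == 1)                         coresPerSM = 8;   // pre-Fermi (CC 1.x)
        else if (prop.major == 2 && prop.minor == 0) coresPerSM = 32;  // Fermi CC 2.0 (e.g. GTX 570)
        else if (prop.major == 2 && prop.minor == 1) coresPerSM = 48;  // Fermi CC 2.1

        printf("SMs (multiprocessors): %d\n", prop.multiProcessorCount);
        printf("Cores per SM:          %d\n", coresPerSM);
        printf("Total cores:           %d\n", prop.multiProcessorCount * coresPerSM);

        // The per-SM block limit is also not in cudaDeviceProp; for CC 1.x and 2.x it is 8
        // (Occupancy Calculator, "GPU Data" tab, "Thread Blocks / Multiprocessor").
        printf("Max blocks per SM:     8\n");
        return 0;
    }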