In the context of GPUs, can someone clarify the difference in meaning between these terms: core, thread processor, stream processor, and multiprocessor?
Also, how can I find the limit on the number of active blocks per SM for a GTX 570 GPU (compute capability 2.0)?
Further, the device properties show that the maximum number of threads per block on my GPU is 1024, but the CUDA Occupancy Calculator does not accept that value for a compute capability 2.0 GPU. Is there a newer version of the CUDA Occupancy Calculator (after 2.1)?
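
For concreteness, the thread-count side of the blocks-per-SM limit can be sketched as below. This is only a rough sketch under stated assumptions: the per-SM figures (8 resident blocks, 1536 resident threads, warp size 32) are assumed values for compute capability 2.0 and should be checked against the CUDA C Programming Guide, and the names `SmLimits`, `kFermiCc20`, and `activeBlocksPerSm` are hypothetical, not part of any CUDA API. Register and shared memory usage are ignored here and can lower the result further.

```cpp
#include <algorithm>

// Assumed per-SM limits for compute capability 2.0. These are hard-coded
// assumptions, not values read from the device.
struct SmLimits {
    int maxBlocksPerSm;   // resident blocks per SM
    int maxThreadsPerSm;  // resident threads per SM
    int warpSize;         // threads per warp
};

constexpr SmLimits kFermiCc20 = {8, 1536, 32};

// Upper bound on resident blocks per SM from the thread and block limits
// only. Register and shared memory pressure are not modelled.
int activeBlocksPerSm(int threadsPerBlock, const SmLimits& lim) {
    if (threadsPerBlock <= 0) return 0;
    // Threads are scheduled in whole warps, so round the block up to warps.
    int warpsPerBlock = (threadsPerBlock + lim.warpSize - 1) / lim.warpSize;
    int maxWarpsPerSm = lim.maxThreadsPerSm / lim.warpSize;
    // How many blocks fit before the resident-warp budget is exhausted.
    int byThreads = maxWarpsPerSm / warpsPerBlock;
    // The hardware block limit caps the result for small blocks.
    return std::min(byThreads, lim.maxBlocksPerSm);
}
```

Under these assumed limits, `activeBlocksPerSm(1024, kFermiCc20)` gives 1, since only one 1024-thread block fits within 1536 resident threads, while `activeBlocksPerSm(128, kFermiCc20)` is capped at 8 by the block limit.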