I'm new in GPGPU programming and I'm working with NVIDIA implementation of OpenCL.
My question was how to compute the limit of a GPU device (in number of threads).
From what I understood a there are a number of work-group (equivalent of blocks in CUDA) that contain a number of work-item (~ cuda thread).
How do I get the number of work-group present on my card (and that can run at the same time) and the number of work-item present on one work group?
To what CL_DEVICE_MAX_COMPUTE_UNITS corresponds?
The khronos specification speeks of cores ("The number of parallel compute cores on the OpenCL device.") what is the difference with the CUDA core given in the specification of my graphic card. In my case openCL gives 14 and my GeForce 8800 GT has 112 core based on NVIDIA website.Does CL_DEVICE_MAX_WORK_GROUP_SIZE (512 in my case) corresponds to the total of work-items given to a specific work-group or the number of work-item that can run at the same time in a work-group?
Any suggestions would be extremely appreciated.