Is there any way to create more than 65535 blocks in CUDA?
If I understand correctly, the maximum number of threads per block is 1024 (CUDA 8), so a launch can cover an index space of at most about 2^16 (blocks) * 2^10 (threads) = 2^26 threads.
Is there any way to create a 2^32 (32-bit) index space?
What I want to do is launch 2^32 threads in total. A simple example: I have malloc'ed 4 GB of memory, and I want to fill it with counters from 1 to 0xffffffff.
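
To make the example concrete, here is a minimal sketch of the fill, assuming a grid-stride loop so that the number of elements does not have to match the number of launched threads. This is only one possible approach, not necessarily the right one. The names `fill_counters` and `run_fill` are made up for illustration, and the buffer size (2^28 32-bit counters, 1 GiB) is an assumption chosen to exceed 65535 * 1024 threads while fitting on most cards.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Grid-stride loop: each thread starts at its global index and advances by the
// total number of launched threads, so n may exceed gridDim.x * blockDim.x.
__global__ void fill_counters(uint32_t *buf, size_t n)
{
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        buf[i] = (uint32_t)(i + 1);   // counters start at 1
}

// Hypothetical helper: fills n counters on the device and checks both ends.
// Returns 0 on success, 1 on any CUDA error or mismatch.
int run_fill(size_t n)
{
    uint32_t *d_buf = nullptr;
    if (cudaMalloc((void **)&d_buf, n * sizeof(uint32_t)) != cudaSuccess) return 1;

    fill_counters<<<65535, 1024>>>(d_buf, n);   // stays within the 65535-block limit
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    uint32_t first = 0, last = 0;
    if (err == cudaSuccess)
        err = cudaMemcpy(&first, d_buf, sizeof(first), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(&last, d_buf + (n - 1), sizeof(last), cudaMemcpyDeviceToHost);
    cudaFree(d_buf);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("first = %u, last = %u\n", first, last);
    return (first == 1u && last == (uint32_t)n) ? 0 : 1;
}

int main()
{
    // Assumed size: 2^28 counters (1 GiB), more than 65535 * 1024 threads.
    return run_fill((size_t)1 << 28);
}
```

With this pattern each thread handles several elements, so the launch stays at 65535 blocks of 1024 threads, but it is not literally one thread per element, which is what I was originally asking about.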