I need to know something about CUDA shared memory. Let's say I assign 50 blocks with 10 threads per block in a G80 card. Each SM processor of a G80 can handle 8 blocks simultaneously. Assume that, after doing some calculations, the shared memory is fully occupied.
What will be the values in shared memory when the next 8 new blocks arrive? Will the previous values reside there? Or will the previous values be copied to global memory and the shared memory refreshed for next 8 blocks?