I have a question about how shared variables work.
When I declare a shared variable in a kernel like this
__shared__ int array1[N]
every unique shared memory of each active block now has an instance of array1 with size N. Meaning that every shared memory of each active block has now allocated N*sizeof(int) bytes.
And N*sizeof(int) must be at most 16KB for a gpu with compute capability 1.3.
So, assuming the above is correct and using 2D threads and 2D blocks assigned at host like this:
dim3 block_size(22,22);
dim3 grid_size(25,25);
I would have 25x25 instances of array1 with size N*sizeof(int) each and the most threads that could access each shared memory of a block is 22x22. This was my original question and it was answered.
Q: When I assign a value to array1
array1[0]=1;
then do all active blocks assign that value instantly at their own shared memory?