1
votes

I thought that shared memory on a CUDA device is private to a block. However, the pointer to shared memory appears to be identical across two blocks:

#include <stdio.h>

__global__ void foo() {
    __shared__ int ar[8];
    printf("shared memory pointer %p at blockidx %i\n", ar, blockIdx.x);
}

int main() {
    dim3 blockDim(1);
    dim3 gridDim(2);
    foo<<<gridDim, blockDim>>>();
    cudaDeviceSynchronize();
}

Running the code above produces:

shared memory pointer 0x7f88f5000000 at blockidx 0
shared memory pointer 0x7f88f5000000 at blockidx 1

With this program, I expected to create two different blocks, initialize shared memory on each block, and obtain two different memory locations. Am I misunderstanding something? Or do these pointers indeed refer to different physical locations, even though the addresses look the same from within each block?


1 Answer

3
votes

Shared memory is block-private, i.e. threads from one block cannot access another block's shared memory.

For this very reason, it is actually to be expected that the address range for shared memory is the same for all blocks. Within each block, loading from or storing to these addresses affects only that block's own shared memory.

For intuition: This is somewhat similar to how, on the CPU, code in two processes may use identical pointer addresses but they will actually access different physical locations in memory (usually).
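
One way to check this yourself is a minimal sketch like the following. The names tag_blocks and blocks_see_private_copies are made up for illustration. Each block stores its own index in a __shared__ variable and reads it back; if the blocks really shared one copy, some block could read another block's value. The printed addresses will likely match across blocks, as in the question, but that is an observation about current hardware and toolchains rather than something I would rely on.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: every block writes its own index into the same
// __shared__ variable, then reads it back and reports what it saw.
__global__ void tag_blocks(int *seen, unsigned long long *addr) {
    __shared__ int tag;
    if (threadIdx.x == 0) {
        tag = blockIdx.x;              // store into this block's shared memory
    }
    __syncthreads();                   // make the store visible to the whole block
    if (threadIdx.x == 0) {
        seen[blockIdx.x] = tag;        // value read back by this block
        addr[blockIdx.x] = (unsigned long long)(&tag);  // address as seen by this block
    }
}

// Returns true if every block read back exactly the value it wrote.
bool blocks_see_private_copies() {
    const int kBlocks = 4;
    int *d_seen = nullptr;
    unsigned long long *d_addr = nullptr;
    if (cudaMalloc(&d_seen, kBlocks * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc(&d_addr, kBlocks * sizeof(unsigned long long)) != cudaSuccess) {
        cudaFree(d_seen);
        return false;
    }

    tag_blocks<<<kBlocks, 32>>>(d_seen, d_addr);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    int seen[kBlocks] = {0};
    unsigned long long addr[kBlocks] = {0};
    ok = ok && cudaMemcpy(seen, d_seen, sizeof(seen), cudaMemcpyDeviceToHost) == cudaSuccess;
    ok = ok && cudaMemcpy(addr, d_addr, sizeof(addr), cudaMemcpyDeviceToHost) == cudaSuccess;

    for (int b = 0; ok && b < kBlocks; ++b) {
        printf("block %d: &tag = 0x%llx, value read back = %d\n", b, addr[b], seen[b]);
        ok = ok && (seen[b] == b);     // each block must see its own value
    }

    cudaFree(d_seen);
    cudaFree(d_addr);
    return ok;
}

int main() {
    return blocks_see_private_copies() ? 0 : 1;
}
```

If the addresses printed are the same but every block still reads back its own index, that matches the explanation above: identical addresses, separate per-block storage behind them.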