I am loading data from global memory into shared memory and would like to know whether there is a bank conflict. Here is the setup:
In global memory: g_array, a 2D matrix of size (256, 64).
This is how I load the array data from global memory into shared memory. I launch the kernel with gridDim (4, 1) and blockDim (16, 16):
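real, shared :: s_array(0:15,0:15)   ! declarations must precede executable statements
! 0-based global indices (blockIdx and threadIdx start at 1 in CUDA Fortran)
d_j = (blockIdx%x-1) * blockDim%x + threadIdx%x-1
d_l = (blockIdx%y-1) * blockDim%y + threadIdx%y-1
! 0-based indices within the block
tIdx = threadIdx%x-1
tIdy = threadIdx%y-1
! each thread copies one element into the shared tile
s_array(tIdx,tIdy) = g_array(d_j,d_l)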
doSomethingWithMySharedMemoryData()
.....
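
To make the access pattern concrete, here is a minimal host-side sketch in plain Python (not CUDA Fortran, so it runs without a GPU) that maps each thread of one 16x16 block to the shared-memory bank it writes. It rests on several assumptions that are not stated in the setup above: 32 banks that are each 4 bytes wide, 32 threads per warp with threadIdx%x varying fastest, a 4-byte default real, and column-major storage for s_array. The function name bank_conflict_degree is made up for illustration.

```python
NUM_BANKS = 32              # assumption: 32 banks, each 4 bytes wide
WARP_SIZE = 32              # assumption: 32 threads per warp
BLOCK_X, BLOCK_Y = 16, 16   # blockDim used in the launch above


def bank_conflict_degree(index_of):
    """Worst-case number of distinct 4-byte words that map to one bank in any warp.

    index_of(tidx, tidy) returns the 0-based element offset into shared
    memory touched by the thread with 0-based indices (tidx, tidy).
    A result of 1 means no bank conflict under this model.
    """
    worst = 1
    for warp_start in range(0, BLOCK_X * BLOCK_Y, WARP_SIZE):
        words_per_bank = {}
        for lin in range(warp_start, warp_start + WARP_SIZE):
            # linear thread id within the block, x index varying fastest
            tidx, tidy = lin % BLOCK_X, lin // BLOCK_X
            word = index_of(tidx, tidy)
            # identical words in one bank are broadcast, so count distinct words
            words_per_bank.setdefault(word % NUM_BANKS, set()).add(word)
        worst = max(worst, max(len(w) for w in words_per_bank.values()))
    return worst


# s_array(tIdx, tIdy): column-major, first dimension has extent 16
as_posted = bank_conflict_degree(lambda tidx, tidy: tidx + 16 * tidy)

# for contrast only: the transposed access s_array(tIdy, tIdx)
transposed = bank_conflict_degree(lambda tidx, tidy: tidy + 16 * tidx)

print(as_posted, transposed)
```

Under these assumptions the model reports 1 for the indexing as posted (each warp writes 32 consecutive words, one per bank) and 8 for the transposed indexing. On hardware with a different bank count or warp scheduling the numbers would differ, so this is a way to reason about the pattern rather than a measurement.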