Developers,
could someone give me a hint, please? I couldn't find any information on how to allocate static (fixed-size) and dynamic shared memory in the same kernel. Or, to ask more precisely: how do I launch a kernel when the amount of shared memory that needs to be allocated is only partly known at compile time? Referring to "allocating shared memory", for example, it is pretty obvious how to do a purely dynamic allocation. But let's assume I have the following kernel:
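__global__ void MyKernel(int Float4ArrSize, int FloatArrSize)
{
    // Statically sized shared arrays, known at compile time
    __shared__ float Arr1[256];
    __shared__ char Arr2[256];
    // Unsized array backed by the dynamic shared memory requested at launch
    extern __shared__ float DynamArr[];
    // Partition the dynamic block: float4 elements first, then floats
    float4* DynamArr1 = (float4*) DynamArr;
    float* DynamArr2 = (float*) &DynamArr1[Float4ArrSize];
    // do something
}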
Kernel call:
// Float4ArrSize and FloatArrSize are element counts, so convert them to bytes
int SharedMemorySize = Float4ArrSize * sizeof(float4) + FloatArrSize * sizeof(float);
MyKernel<<< numBlocks, threadsPerBlock, SharedMemorySize, stream>>>(Float4ArrSize, FloatArrSize);
I wasn't able to figure out how the compiler ties the size given at launch only to the part I want to allocate dynamically. Or does the parameter "SharedMemorySize" represent the total amount of shared memory per block, so that I also need to add the size of the static arrays (int SharedMemorySize = Float4ArrSize * sizeof(float4) + FloatArrSize * sizeof(float) + 256 * sizeof(float) + 256 * sizeof(char))?
Please enlighten me or just simply point to some code snippets. Thanks a lot in advance.
cheers greg
__shared__ variable. Try to combine everything into a single struct. - Soroosh Bateni
__shared__ variable. - Soroosh Bateni
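For what it's worth, here is a minimal sketch of the mixed case from the question. As far as I know, the third launch parameter covers only the dynamic part (the extern array) and is given in bytes; the statically declared arrays are reserved by the compiler on top of that, so their size is not added by hand. The helper `dynamicSharedBytes`, the `out` parameter, and the concrete sizes are my own assumptions for illustration, not part of the original kernel. The extern array is declared as `float4` here so that its base is aligned for the strictest type stored in it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: bytes of DYNAMIC shared memory only.
// The static arrays inside the kernel are deliberately not counted.
static size_t dynamicSharedBytes(int float4Count, int floatCount)
{
    return float4Count * sizeof(float4) + floatCount * sizeof(float);
}

__global__ void MyKernel(int float4Count, int floatCount, float* out)
{
    // Static shared memory: sized at compile time, reserved automatically.
    __shared__ float Arr1[256];
    __shared__ char  Arr2[256];

    // Dynamic shared memory: a single unsized extern array per kernel,
    // partitioned manually into a float4 part followed by a float part.
    extern __shared__ float4 DynamArr[];
    float4* DynamArr1 = DynamArr;
    float*  DynamArr2 = reinterpret_cast<float*>(&DynamArr1[float4Count]);

    int tid = threadIdx.x;
    Arr1[tid] = 1.0f;
    Arr2[tid] = 2;
    for (int i = tid; i < float4Count; i += blockDim.x)
        DynamArr1[i] = make_float4(1.0f, 1.0f, 1.0f, 1.0f);
    for (int i = tid; i < floatCount; i += blockDim.x)
        DynamArr2[i] = 3.0f;
    __syncthreads();  // make all shared writes visible to the block

    // Read one value from each of the four shared regions.
    out[tid] = Arr1[tid] + Arr2[tid]
             + DynamArr1[(tid + 1) % float4Count].x
             + DynamArr2[(tid + 1) % floatCount];
}

int main()
{
    const int float4Count = 64, floatCount = 128, threads = 256;
    float* dOut = nullptr;
    cudaMalloc(&dOut, threads * sizeof(float));

    // Only the dynamic portion is passed as the third launch parameter.
    size_t bytes = dynamicSharedBytes(float4Count, floatCount);
    MyKernel<<<1, threads, bytes>>>(float4Count, floatCount, dOut);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float hOut[threads];
    cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    printf("out[0] = %.1f\n", hOut[0]);  // 1 + 2 + 1 + 3 = 7.0
    cudaFree(dOut);
    return 0;
}
```

With these sizes the block uses 1280 bytes of static and 1536 bytes of dynamic shared memory, but only the 1536 appear in the launch configuration. If the float part came first, the float4 pointer could end up misaligned, which is why the sketch places the widest type at the start of the dynamic block.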