I am writing a very long CUDA kernel, and it is becoming hard to read. Is there any way to organize CUDA kernels using functions defined outside of the kernel? For example:
__global__ void CUDA_Kernel(int* a, int* b) {
    // call function 1
    // call function 2
    // calculation function
    .......
}
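For context, this is roughly what I am hoping is possible: a minimal sketch where the per-thread work is factored into `__device__` helper functions called from the kernel (the helper names `scale` and `combine` below are just placeholders for my real functions).

```cuda
// Placeholder helpers standing in for "function 1" and the calculation step.
// __device__ functions can be defined outside the kernel and called from it.
__device__ int scale(int x) {
    return 2 * x;
}

__device__ int combine(int x, int y) {
    return x + y;
}

__global__ void CUDA_Kernel(int* a, int* b) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // The kernel body stays short; the details live in the helpers.
    b[i] = combine(scale(a[i]), b[i]);
}
```

Is this the idiomatic way to do it, and are there performance implications (e.g. inlining) I should be aware of?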