I'm porting OpenCL code from my Mac to a Linux box with an NVIDIA Tesla K20c card and have hit a snag with a simple kernel. The kernel source is:
char kernel[1024] =
"#pragma OPENCL EXTENSION cl_khr_fp64: enable \
\
kernel void diff(global double* u, \
int N, \
double dx, \
global double* du) \
{ \
size_t i = get_global_id(0); \
int ip = (i+1)%N; \
int im = (i+N-1)%N; \
du[i] = (u[ip] - u[im])/dx/2.; \
}";
I call this with:
const char* srccode = kernel;
cl_program program = clCreateProgramWithSource(context, 1, &srccode, NULL, &err);
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
cl_kernel diff_kernel = clCreateKernel(program, "diff", &err); /* renamed so it does not clash with the char array "kernel" above */
clBuildProgram returns CL_SUCCESS and the build log from clGetProgramBuildInfo is empty, but clCreateKernel fails and sets err to CL_INVALID_KERNEL_NAME. Any idea why? I've been banging at this for a while and can't find anything. If I change all the doubles to floats and remove the pragma, the problem goes away and the kernel works correctly. So is the pragma to blame? If so, how do I use it correctly?
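One possible cause worth ruling out, offered as a guess rather than a confirmed diagnosis: a backslash at the end of a line inside a C string literal splices the lines together without inserting a newline, so the whole source reaches the OpenCL compiler as a single line. A preprocessor directive runs to the end of its line, so the kernel definition may be treated as trailing text of the #pragma and never compiled. That would fit a clean build followed by a missing kernel name. The sketch below is plain C with no OpenCL calls. The names spliced_src, newline_src and kernel_on_pragma_line are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as in the question: backslash-newline only splices lines,
   so this literal contains no '\n' characters at all. */
static const char spliced_src[] =
"#pragma OPENCL EXTENSION cl_khr_fp64: enable \
kernel void diff(global double* u, int N, double dx, global double* du) \
{ \
    size_t i = get_global_id(0); \
    int ip = (i+1)%N; \
    int im = (i+N-1)%N; \
    du[i] = (u[ip] - u[im])/dx/2.; \
}";

/* Alternative layout: adjacent string literals with explicit "\n",
   so the #pragma ends before the kernel definition starts. */
static const char newline_src[] =
"#pragma OPENCL EXTENSION cl_khr_fp64: enable\n"
"kernel void diff(global double* u, int N, double dx, global double* du)\n"
"{\n"
"    size_t i = get_global_id(0);\n"
"    int ip = (i+1)%N;\n"
"    int im = (i+N-1)%N;\n"
"    du[i] = (u[ip] - u[im])/dx/2.;\n"
"}\n";

/* Returns 1 if "kernel void" sits on the same line as the first "#pragma"
   in src (i.e. no newline separates them), 0 otherwise. */
static int kernel_on_pragma_line(const char *src)
{
    const char *pragma = strstr(src, "#pragma");
    const char *kern;
    const char *eol;

    if (pragma == NULL)
        return 0;
    kern = strstr(pragma, "kernel void");
    if (kern == NULL)
        return 0;
    eol = strchr(pragma, '\n');
    /* No newline at all, or the kernel starts before the first newline. */
    return eol == NULL || kern < eol;
}
```

Calling kernel_on_pragma_line(spliced_src) from a small main shows whether the directive and the kernel share a line. If they do, passing a string laid out like newline_src to clCreateProgramWithSource would be the thing to try next. Whether that resolves the error on this particular driver is untested here.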