15
votes

I'm using OpenCL on an NVIDIA GPU and I keep getting CL_INVALID_KERNEL_ARGS when I try to execute a kernel. I've reduced it to a very simple program:

__kernel void foo(int a, __write_only image2d_t bar)
{
  int2 coords = {0, get_global_id(0)};
  write_imagef(bar, coords, (float4)a);
}

With the following C program (initialization and error checking omitted for brevity):

cl_kernel foo = clCreateKernel(program, "foo", &err);
int a = 42;
clSetKernelArg(foo, 0, sizeof(int), &a);

cl_image_format fmt = {CL_INTENSITY, CL_FLOAT};
cl_mem bar = clCreateImage2D(ctx, CL_MEM_WRITE_ONLY|CL_MEM_ALLOC_HOST_PTR, &fmt, 100, 1, 0, NULL, &err);
clSetKernelArg(foo, 1, sizeof(cl_mem), &bar);

size_t gws[] = {100};
size_t lws[] = {100};
cl_event evt;
clEnqueueNDRangeKernel(queue, foo, 1, NULL, gws, lws, 0, NULL, &evt);
clFinish(queue);

The clEnqueueNDRangeKernel keeps returning CL_INVALID_KERNEL_ARGS. Any ideas?
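
Since the listing above leaves out error checking, one way to narrow this down is to check the status returned by each clCreateImage2D and clSetKernelArg call before the enqueue. Below is a minimal sketch of a lookup helper for that purpose. The name cl_error_name and the table are hypothetical, and the numeric values are copied from the Khronos CL/cl.h header so that the snippet compiles without the OpenCL SDK; it only covers a handful of codes relevant here.

```c
#include <stddef.h>

/* Hypothetical helper table: OpenCL status codes and their symbolic names.
   Values mirror CL/cl.h; only a subset relevant to this question is listed. */
struct cl_status_name {
    int code;
    const char *name;
};

static const struct cl_status_name CL_STATUS_NAMES[] = {
    {   0, "CL_SUCCESS" },
    {  -4, "CL_MEM_OBJECT_ALLOCATION_FAILURE" },
    { -10, "CL_IMAGE_FORMAT_NOT_SUPPORTED" },
    { -30, "CL_INVALID_VALUE" },
    { -38, "CL_INVALID_MEM_OBJECT" },
    { -39, "CL_INVALID_IMAGE_FORMAT_DESCRIPTOR" },
    { -40, "CL_INVALID_IMAGE_SIZE" },
    { -48, "CL_INVALID_KERNEL" },
    { -49, "CL_INVALID_ARG_INDEX" },
    { -50, "CL_INVALID_ARG_VALUE" },
    { -51, "CL_INVALID_ARG_SIZE" },
    { -52, "CL_INVALID_KERNEL_ARGS" },
    { -54, "CL_INVALID_WORK_GROUP_SIZE" },
};

/* Returns the symbolic name for a status code, or a fallback string
   when the code is not in the table above. */
const char *cl_error_name(int code)
{
    size_t n = sizeof(CL_STATUS_NAMES) / sizeof(CL_STATUS_NAMES[0]);
    for (size_t i = 0; i < n; ++i) {
        if (CL_STATUS_NAMES[i].code == code)
            return CL_STATUS_NAMES[i].name;
    }
    return "UNKNOWN_CL_ERROR";
}
```

In the host code this could be used as err = clSetKernelArg(foo, 1, sizeof(cl_mem), &bar); followed by printing cl_error_name(err) whenever err is not CL_SUCCESS, and likewise for the err written by clCreateImage2D.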

2
Shouldn't your clSetKernelArg calls be setting kern instead of foo? - KLee1
Also the fourth argument of clEnqueueNDRangeKernel (global_work_offset) must be NULL according to the spec, but you are passing gwo, a pointer to a NULL value. - James Beilby
KLee1 - Sorry, that's a transcription error, I've fixed it. - Trevor
James - I changed that but it had no bearing on the error. Changed it in the sample. - Trevor
I'm always casting the arg_value to (void *) inside clSetKernelArg(). Try that perhaps. - Paul Irofti

2 Answers

5
votes

See https://stackoverflow.com/a/20566270/431528.

How large are the buffer objects you are passing? __constant arguments are allocated from a separate memory space, not from global memory, so you have probably run out of constant memory.

Check CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE using clGetDeviceInfo to ensure you are not exceeding that size.
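
As a rough sketch of that check: the helper below compares the sizes of the buffers bound to __constant arguments against the device limit. The function name first_oversized_constant_arg is hypothetical, and the limit is assumed to have been read beforehand with clGetDeviceInfo(device, CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, sizeof(cl_ulong), &max_bytes, NULL). The comparison is kept free of OpenCL types so it compiles on its own.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first buffer whose size
   exceeds the device's maximum constant buffer size, or -1 if every
   buffer fits. max_constant_bytes is assumed to be the value reported
   for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE by clGetDeviceInfo. */
int first_oversized_constant_arg(const size_t *sizes, size_t count,
                                 unsigned long long max_constant_bytes)
{
    for (size_t i = 0; i < count; ++i) {
        if ((unsigned long long)sizes[i] > max_constant_bytes)
            return (int)i;
    }
    return -1;
}
```

A return value of -1 means none of the listed buffers exceeds the limit on its own; the number of __constant arguments is capped separately by CL_DEVICE_MAX_CONSTANT_ARGS.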

4
votes

You are trying to pass a host variable directly to the kernel. You need to create a cl_mem variable, copy the value into it using clEnqueueWriteBuffer, and then pass the cl_mem or cl_int variable to the kernel. Other than that, your code looks fine to me.