I want to get the maximum global work size for a kernel. I don't want to rely on OpenCL choosing a size for me, which may or may not be the maximum.
To do this, I want to specify the size myself when calling clEnqueueNDRangeKernel, e.g.:
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, NULL, 0, NULL, NULL);
The clGetKernelWorkGroupInfo documentation says:
CL_KERNEL_GLOBAL_WORK_SIZE : This provides a mechanism for the application to query the maximum global size that can be used to execute a kernel (i.e. global_work_size argument to clEnqueueNDRangeKernel) on a custom device given by device or a built-in kernel on an OpenCL device given by device.
How can I get CL_KERNEL_GLOBAL_WORK_SIZE with the OpenCL C++ bindings?
I tried this:
cl::array<size_t, 3> kernel_global_work_size = my_kernel.getWorkGroupInfo<CL_KERNEL_GLOBAL_WORK_SIZE>(my_device);
But I get this compile error:
cl2.hpp:5771:12: note: candidate: template<class T> cl_int cl::Kernel::getWorkGroupInfo(const cl::Device&, cl_kernel_work_group_info, T*) const
cl_int getWorkGroupInfo(
^~~~~~~~~~~~~~~~
cl2.hpp:5771:12: note: template argument deduction/substitution failed:
cl2.hpp:5782:9: note: candidate: template<int name> typename cl::detail::param_traits<cl::detail::cl_kernel_work_group_info, name>::param_type cl::Kernel::getWorkGroupInfo(const cl::Device&, cl_int*) const
getWorkGroupInfo(const Device& device, cl_int* err = NULL) const
And with this code:
cl::array<size_t, 3> kernel_global_work_size;
my_kernel.getWorkGroupInfo<cl::array<size_t, 3>>(my_device, CL_KERNEL_GLOBAL_WORK_SIZE, &kernel_global_work_size);
I get OpenCL error -30 (CL_INVALID_VALUE) at runtime.
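For reference, -30 is the value of CL_INVALID_VALUE in CL/cl.h. Below is a minimal sketch of a helper that turns a few such status codes into their symbolic names. `cl_error_name` is a hypothetical name, not part of the OpenCL C++ bindings, and the numeric values are hard-coded from the standard header so the snippet builds without OpenCL installed.

```cpp
#include <string>

// Hypothetical helper (not part of cl2.hpp): maps a few OpenCL status codes
// to their symbolic names. The numeric values are copied from CL/cl.h.
std::string cl_error_name(int err) {
    switch (err) {
        case 0:   return "CL_SUCCESS";
        case -30: return "CL_INVALID_VALUE";
        case -33: return "CL_INVALID_DEVICE";
        case -48: return "CL_INVALID_KERNEL";
        // Codes not listed here are reported with their raw value.
        default:  return "unknown (" + std::to_string(err) + ")";
    }
}
```

Calling cl_error_name(-30) returns the string CL_INVALID_VALUE, which matches the error reported above.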
my_kernel is not a built-in kernel, e.g.:
cl::Kernel my_kernel = cl::Kernel(program, "my_kernel");
my_device is not a custom device, e.g.:
cl::Device my_device = myDevices[0];