I'm trying to learn the new features of OpenCL 2.0, and I've written a small kernel to try out device-side enqueue. The kernel is below:
#pragma OPENCL EXTENSION cl_amd_printf : enable

__kernel void call_me(__global int *a);
__kernel void templateKernel(__global unsigned int * output,
                             __global unsigned int * input,
                             const unsigned int multiplier);

__kernel void call_me(__global int *a)
{
    //do nothing
    int id = get_global_id(0);
    //a[id] = b[id];
}

__kernel void templateKernel(__global unsigned int * output,
                             __global unsigned int * input,
                             const unsigned int multiplier)
{
    uint tid = get_global_id(0);
    int lid = get_local_id(0);
    int gid = get_group_id(0);
    int broadcast = 1;
    int global_size = get_global_size(0);
    if(gid == 0) {
        broadcast = work_group_broadcast(5, 0);
    }
    int collection = work_group_scan_exclusive_add(broadcast);
    void (^kernel_block)(void) = ^{call_me(input);};
    //output[tid] = input[tid] * multiplier + collection + broadcast;
    output[tid] = collection;
    //output[tid] = global_size;
    size_t size = 100;
    //printf("hey %d\n", broadcast);
    ndrange_t ndrange = ndrange_1D(size);
    queue_t default_queue = get_default_queue();
    /*
    if(tid == 0){
        int status = enqueue_kernel(
            default_queue,
            CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
            ndrange,
            kernel_block
        );
    }
    */
}
This kernel isn't meant to do anything useful; I only want the enqueue_kernel call inside it to succeed without the program segfaulting. What's wrong with it? The segmentation fault goes away when the enqueue_kernel call is removed. My OpenCL C compiler is set to -cl-std=CL2.0 and is confirmed to be working, since the work-group broadcast and scan functions work properly.
I'm using the AMD APP SDK 3.0 Beta. Any help is appreciated.
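For narrowing this down, a stripped-back parent/child pair that records the return value of enqueue_kernel can help separate a failed enqueue from a crash elsewhere. This is only a sketch in OpenCL C 2.0: the kernel names `child` and `parent` and the extra `status_out` buffer are hypothetical, and it assumes `a` holds at least 100 ints and that a default on-device queue exists.

```c
// OpenCL C 2.0 device code (build with -cl-std=CL2.0).
// Names child, parent and status_out are illustrative, not from the post.
__kernel void child(__global int *a)
{
    // Mark each element so the host can see that the child actually ran.
    a[get_global_id(0)] = 1;
}

__kernel void parent(__global int *a, __global int *status_out)
{
    // Only one work-item enqueues the child kernel.
    if (get_global_id(0) == 0) {
        queue_t q = get_default_queue();
        ndrange_t range = ndrange_1D(100);   // assumes a has >= 100 elements

        int status = enqueue_kernel(q,
                                    CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
                                    range,
                                    ^{ child(a); });

        // CLK_SUCCESS means the enqueue was accepted; anything else is a failure code.
        status_out[0] = status;
    }
}
```

Reading `status_out[0]` back on the host shows whether the enqueue itself reported an error, provided the runtime gets that far without crashing.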
enqueueNDRangeKernel() therefore it segfaults because the pointer is not valid. Do you have any error in any CL call before the segfault? - DarkZeros
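One host-side thing worth ruling out here: in OpenCL 2.0, get_default_queue() is only expected to return a usable queue if the host has created a default on-device queue for that context and device. Below is a minimal sketch of doing that with error checking. It assumes `context` and `device` were already obtained by the usual setup code; the helper name `create_default_device_queue` is hypothetical, and whether a missing device queue is the actual cause of this segfault is an assumption, not something the post confirms.

```c
#define CL_TARGET_OPENCL_VERSION 200
#include <CL/cl.h>
#include <stdio.h>

/* Hypothetical helper: creates the default on-device queue that
 * get_default_queue() refers to inside a kernel.
 * Returns NULL (and prints the error code) on failure. */
cl_command_queue create_default_device_queue(cl_context context,
                                             cl_device_id device)
{
    /* On-device queues must be out-of-order; ON_DEVICE_DEFAULT marks
     * this one as the device's default queue. */
    const cl_queue_properties props[] = {
        CL_QUEUE_PROPERTIES,
        (cl_queue_properties)(CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE |
                              CL_QUEUE_ON_DEVICE |
                              CL_QUEUE_ON_DEVICE_DEFAULT),
        0
    };

    cl_int err = CL_SUCCESS;
    cl_command_queue queue =
        clCreateCommandQueueWithProperties(context, device, props, &err);

    if (err != CL_SUCCESS) {
        fprintf(stderr, "device queue creation failed: %d\n", (int)err);
        return NULL;
    }
    return queue;
}
```

Calling this once before launching templateKernel, and checking the `cl_int` result of every other host call in the same way, should show whether an earlier CL call already failed before the crash.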