I'm trying to learn the new features of OpenCL 2.0, and I've written a small kernel to demonstrate device-side enqueue. The kernel is below:

#pragma OPENCL EXTENSION cl_amd_printf : enable

__kernel void call_me(__global int *a);

__kernel void templateKernel(__global  unsigned int * output,
                         __global  unsigned int * input,
                         const     unsigned int multiplier);


__kernel void call_me(__global int *a)
{   
//do nothing
int id = get_global_id(0);
//a[id] = b[id];
}



__kernel void templateKernel(__global  unsigned int * output,
                         __global  unsigned int * input,
                         const     unsigned int multiplier)
{
uint tid = get_global_id(0);
int lid = get_local_id(0);
int gid = get_group_id(0);
int broadcast = 1;
int global_size = get_global_size(0);
if(gid == 0) {
    broadcast = work_group_broadcast(5, 0);
}
int collection = work_group_scan_exclusive_add(broadcast);  


void (^kernel_block)(void) = ^{call_me(input);};
//output[tid] = input[tid] * multiplier + collection + broadcast;
output[tid] = collection;
//output[tid] = global_size;
size_t size = 100;
//printf("hey %d\n", broadcast);
ndrange_t ndrange = ndrange_1D(size);
queue_t default_queue = get_default_queue();
/*
if(tid == 0){ 
    int status = enqueue_kernel(
        default_queue, 
        CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
        ndrange,
        kernel_block            
        );
    }
*/
}

This kernel isn't supposed to do anything useful. The only goal is for the enqueue_kernel call inside it to succeed without the program segfaulting. What's wrong with it? The segmentation fault goes away when the enqueue_kernel call is removed. My OpenCL C compiler is set to -cl-std=CL2.0, and that is confirmed to be working, since the work_group_broadcast and work_group_scan_exclusive_add functions behave properly.

I'm using AMDAPPSDK 3.0 Beta. Any help is appreciated.

Is it even compiling? A typical problem is passing a kernel that was never compiled to enqueueNDRangeKernel(), which then segfaults because the pointer is not valid. Do you get an error from any CL call before the segfault? - DarkZeros
I have checks along the way for every API call; they read the status int. clBuildProgram returns okay, as does everything else before enqueueNDRangeKernel. The funny thing is that on this Ubuntu box, while running watch -n 0.1 ./executable, the build program status message isn't shown. - billyc59

1 Answer

I have solved my own problem.

The issue was that in OpenCL 2.0 the API call for creating command queues, clCreateCommandQueue(), has been deprecated. AMD suggests using the new API call clCreateCommandQueueWithProperties() instead, because it is the one that can create the device-side queue that device-side kernel enqueues rely on.

In addition to using the new API call, you must create at least two command queues: one for the host side and one for the device side. The device queue is created on the host, using the additional properties that the new API call accepts.