I am trying to implement a simple algorithm in preparation for a more complex one. I want to call a kernel several times, and each call should increment every value in an array by, say, 5.

So if the array is initially [1,2,3,4], I want [6,7,8,9] after the first call, [11,12,13,14] after the second call, and so on. But I don't understand how to configure my buffers and how to enqueue them in this case. I tried to follow this tutorial: http://www.browndeertechnology.com/docs/BDT_OpenCL_Tutorial_NBody-rev3.html (this is the algorithm I want to implement in the end, with some modifications), but the library used there hides the most important aspects.

At the moment I create my buffer with:

pos2g_buf = clCreateBuffer(
    context,
    CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
    sizeof(cl_float) * (nparticle*4),
    pos2g,
    &status);

The kernel calls are placed within a for loop:

for(int i=0; i

Inside the loop I set the kernel arguments and enqueue the kernel with:

status = clEnqueueNDRangeKernel(
    oclm->commandQueue,
    kernel,
    NDRangeDimension,
    NULL,
    globalThreads,
    localThreads,
    0,
    NULL,
    &events[0]);

Can someone please help me and show the correct (pseudo-)code for such a simple iterative program?

Many thanks in advance! Michael

1 Answer

On the host side:

cl_mem buffer = clCreateBuffer(..., CL_MEM_READ_WRITE, ...);
cl_kernel kernel = clCreateKernel(...);

// Set once: kernel arguments persist across enqueues.
clSetKernelArg(kernel, 0, sizeof(cl_mem), &buffer);

for(int i=0; i<num_laps; i++){
    clEnqueueNDRangeKernel(..., kernel, ...);
}

void *host_mem = malloc(...);
clEnqueueReadBuffer(..., buffer, ..., host_mem, ...);

On the device side:

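__kernel void my(__global int* mem)
{
    // Each work-item increments its own element; the buffer keeps
    // its contents between launches, so the increments accumulate.
    mem[get_global_id(0)] += 5;
    return;
}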

Don't forget to check return codes and release resources.
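
As a quick sanity check on what clEnqueueReadBuffer returns, you can compare it against a plain C reference. The sketch below is not OpenCL code: reference_laps and matches_reference are hypothetical helper names, and the nested loop only imitates repeated launches of the kernel above on a host array, assuming one work-item per element.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side imitation of `laps` kernel launches over `n` work-items.
   The array stands in for the device buffer: it is never reset between
   laps, so each lap adds `step` on top of the previous result. */
void reference_laps(int *mem, size_t n, int step, int laps)
{
    assert(mem != NULL || n == 0);
    for (int lap = 0; lap < laps; ++lap) {
        /* One pass over gid corresponds to one clEnqueueNDRangeKernel call. */
        for (size_t gid = 0; gid < n; ++gid) {
            mem[gid] += step;
        }
    }
}

/* Returns 1 if the data read back from the device equals the reference,
   0 otherwise. */
int matches_reference(const int *device_result, const int *reference, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (device_result[i] != reference[i]) {
            return 0;
        }
    }
    return 1;
}
```

Starting from {1, 2, 3, 4} with a step of 5, one lap gives {6, 7, 8, 9} and a second lap gives {11, 12, 13, 14}, which is what host_mem should contain after the same number of enqueues.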