0
votes

I have an OpenCL kernel that computes two global buffers in two loops. The first loop does some computations in each global thread and writes the result to the output buffer "OutBuff". The second loop then updates the values of the global buffer "UpdateBuff" according to the results written to "OutBuff" by the first loop (on the previous level). The problem is that the order of the global threads changes between the two loops, because the threads are executed in parallel. In my case, I need to keep the order of thread execution between these two loops, so that both loops are computed with the same global id. For example:

__kernel void globalSynch(__global double4* input, __global uint* points,
                          __global double4* OutBuff, __global double4* UpdateBuff)
{
    int gid = get_global_id(0);
    uint pt;
    for (int level = 0; level < N; level++)
    {
        // First loop: read UpdateBuff, write OutBuff
        for (int i = 0; i < blocksize; i++)
        {
            pt = points[gid*i*level];
            OutBuff[pt] = do_some_computations(UpdateBuff,....);
        }
        barrier(CLK_GLOBAL_MEM_FENCE);
        // Second loop: read OutBuff, write UpdateBuff
        for (int j = 0; j < blocksize1; j++)
        {
            pt = points[gid*j*(level+1)];
            UpdateBuff[pt] = do_some_computations(OutBuff,...);
        }
        barrier(CLK_GLOBAL_MEM_FENCE);
    }
}

Is this something that requires semaphores?


1 Answer

1
votes

This is a common OpenCL misunderstanding. The barrier statement synchronizes only the work-items within a single work group, not the whole global work size. There is no statement for global synchronization inside a kernel, because of how work groups are executed: some run to completion before others even start. The solution for global synchronization is to use separate kernels, one per loop. Enqueued on an in-order command queue, the first will run to completion before the second one starts.
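To make the two-kernel structure concrete, here is a minimal plain-C sketch that emulates it on the CPU, so it runs without an OpenCL runtime. Everything in it is an assumption for illustration: the names `first_kernel`, `second_kernel`, and `run_levels`, the sizes `NUM_ITEMS` and `NUM_LEVELS`, and the arithmetic, which only stands in for `do_some_computations`. Each inner `gid` loop plays the role of one kernel launch. In real host code it would be one `clEnqueueNDRangeKernel` call, with `level` passed through `clSetKernelArg`.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ITEMS  8  /* hypothetical global work size */
#define NUM_LEVELS 3  /* stands in for N in the question */

/* Body of the hypothetical first kernel for one work-item:
 * reads UpdateBuff (its own and a neighbour's entry), writes OutBuff. */
static void first_kernel(int gid, int level,
                         const double *update_buff, double *out_buff)
{
    int neighbour = (gid + 1) % NUM_ITEMS;
    /* Placeholder arithmetic, not the asker's real computation. */
    out_buff[gid] = update_buff[gid] + update_buff[neighbour] + level;
}

/* Body of the hypothetical second kernel for one work-item:
 * reads OutBuff (its own and a neighbour's entry), writes UpdateBuff. */
static void second_kernel(int gid,
                          const double *out_buff, double *update_buff)
{
    int neighbour = (gid + NUM_ITEMS - 1) % NUM_ITEMS;
    update_buff[gid] = 0.5 * (out_buff[gid] + out_buff[neighbour]);
}

/* Host-side loop over levels. Each inner loop finishes for every gid
 * before the next one begins, which is the guarantee that separate
 * kernels give on an in-order command queue. */
void run_levels(double *update_buff, double *out_buff)
{
    assert(update_buff != NULL && out_buff != NULL);
    for (int level = 0; level < NUM_LEVELS; level++) {
        /* "Launch" 1: all work-items write OutBuff. */
        for (int gid = 0; gid < NUM_ITEMS; gid++)
            first_kernel(gid, level, update_buff, out_buff);

        /* The launch boundary acts as the global synchronization point. */

        /* "Launch" 2: all work-items read the completed OutBuff. */
        for (int gid = 0; gid < NUM_ITEMS; gid++)
            second_kernel(gid, out_buff, update_buff);
    }
}
```

The level loop moves from the kernel to the host, and the barrier calls disappear: within one launch no work-item reads a buffer that another work-item writes in that same launch, so the result does not depend on the order in which work-items run.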