I have an OpenCL kernel that computes two global buffers in two loops. The first loop does some computations for the current global work-item and writes the result to the output buffer "OutBuff". The second loop then updates the values of the global buffer "UpdateBuff" from the results that the first loop stored in "OutBuff" (at the previous level). The problem is that the work-items execute in parallel, so the order in which they run is not preserved between the two loops: some work-items can already be in the second loop while others are still in the first. In my case, I need to keep the order of execution between these two loops, and I need to compute both loops with the same global ID. For example:
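    // N, blocksize and blocksize1 are assumed to be compile-time constants.
    __kernel void globalSynch(__global double4 *input,
                              __global uint *points,
                              __global double4 *OutBuff,
                              __global double4 *UpdateBuff)
    {
        int gid = get_global_id(0);
        uint pt;
        for (int level = 0; level < N; level++)
        {
            // First loop: write OutBuff from the current UpdateBuff.
            for (int i = 0; i < blocksize; i++)
            {
                pt = points[gid * i * level];  // assignment, not comparison
                OutBuff[pt] = do_some_computations(UpdateBuff,....);
            }
            // All of OutBuff should be written before anyone reads it below.
            barrier(CLK_GLOBAL_MEM_FENCE);
            // Second loop: update UpdateBuff from the OutBuff results.
            for (int j = 0; j < blocksize1; j++)
            {
                pt = points[gid * j * (level + 1)];
                UpdateBuff[pt] = do_some_computations(OutBuff,...);
            }
            // All of UpdateBuff should be written before the next level.
            barrier(CLK_GLOBAL_MEM_FENCE);
        }
    }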
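To make the ordering I am after concrete, here is a serial reference sketch in plain C of the behaviour I want from the kernel: at each level, every work-item finishes the first loop before any work-item starts the second. It is only an illustration. The sizes (`NUM_ITEMS`, `NUM_LEVELS`, `BLOCK`), the function `reference_run`, and `placeholder_compute` (a stand-in for `do_some_computations`, using `double` instead of `double4`) are all hypothetical names that are not part of my real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only to keep the illustration small. */
enum { NUM_ITEMS = 4, NUM_LEVELS = 3, BLOCK = 2 };

/* Placeholder for do_some_computations: NOT the real computation. */
static double placeholder_compute(const double *src, unsigned pt)
{
    return src[pt] + 1.0;
}

/* Serial reference of the intended semantics: per level, phase 1 runs
 * for ALL work-items before phase 2 runs for any of them. */
void reference_run(const unsigned *points, size_t num_points,
                   double *out_buff, double *update_buff)
{
    for (int level = 0; level < NUM_LEVELS; level++) {
        /* Phase 1: every work-item writes OutBuff from UpdateBuff. */
        for (int gid = 0; gid < NUM_ITEMS; gid++) {
            for (int i = 0; i < BLOCK; i++) {
                size_t idx = (size_t)gid * i * level;
                assert(idx < num_points);
                unsigned pt = points[idx];
                out_buff[pt] = placeholder_compute(update_buff, pt);
            }
        }
        /* The global synchronization point I need sits here. */
        /* Phase 2: every work-item updates UpdateBuff from OutBuff. */
        for (int gid = 0; gid < NUM_ITEMS; gid++) {
            for (int j = 0; j < BLOCK; j++) {
                size_t idx = (size_t)gid * j * (level + 1);
                assert(idx < num_points);
                unsigned pt = points[idx];
                update_buff[pt] = placeholder_compute(out_buff, pt);
            }
        }
    }
}
```

The parallel kernel should produce the same buffers as this serial version, whatever order the work-items run in within each phase.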
Do I need to use semaphores for this?