0
votes

Here is some sample code:

__kernel void my_kernel(__global float* src,
                        __global float* dst){

    // load four consecutive floats starting at src[0]
    float4 a = vload4(0, src);
    //do something to a
    ...
    // store the four floats starting at dst[0]
    vstore4(a, 0, dst);

}

According to the OpenCL 1.2 Reference, the addresses of the global buffers src and dst must be 4-byte aligned when using vloadn and vstoren; otherwise the results are undefined. My question is whether OpenCL automatically aligns the global device address once the call to clCreateBuffer completes. If not, how can I ensure proper alignment? (Also, what about local memory objects?)

2 Answers

0
votes

Refer to the Data Types section of the OpenCL specification: the OpenCL compiler is responsible for aligning data items to the appropriate alignment as required by the data type. So I think the answer is basically yes.

0
votes

Buffers are certainly aligned to a boundary larger than 4 bytes, unless you are using CL_MEM_USE_HOST_PTR, in which case the alignment may depend on the host pointer you supply.
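
For the CL_MEM_USE_HOST_PTR case, one option is to allocate the host memory with an explicit alignment before handing it to clCreateBuffer. The following is only a sketch in plain C11: the helper names `is_aligned` and `alloc_aligned_floats` are made up for illustration, and the 128-byte value is an assumption chosen to match sizeof(long16). The value a given device actually wants can be queried with clGetDeviceInfo and CL_DEVICE_MEM_BASE_ADDR_ALIGN, which reports it in bits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment in bytes (hypothetical choice, equal to sizeof(long16)).
   Query CL_DEVICE_MEM_BASE_ADDR_ALIGN for the real per-device value. */
#define HOST_ALIGN 128

/* Returns non-zero if p is a multiple of alignment bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocates room for count floats on a HOST_ALIGN boundary.
   aligned_alloc (C11) expects the size to be a multiple of the
   alignment, so the byte count is rounded up first.
   The caller releases the memory with free(). */
static float *alloc_aligned_floats(size_t count)
{
    size_t bytes  = count * sizeof(float);
    size_t padded = (bytes + HOST_ALIGN - 1) / HOST_ALIGN * HOST_ALIGN;
    return (float *)aligned_alloc(HOST_ALIGN, padded);
}
```

A pointer obtained this way could then be passed as the host_ptr argument of clCreateBuffer together with CL_MEM_USE_HOST_PTR, and `is_aligned(ptr, 4)` can serve as a quick sanity check for memory that comes from elsewhere.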

By the way, in your code it might be better to declare the parameters as float4* instead of using vload4 and vstore4.
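
As a rough illustration of that suggestion, here is the kernel from the question rewritten with float4* parameters, written as a C string constant the way it might be handed to clCreateProgramWithSource. The variable name `kernel_src` is hypothetical, and the "do something" step is left as a placeholder comment because the question does not show it. Note that a float4 pointer is assumed by the compiler to be 16-byte aligned (sizeof(float4)), which is stricter than the 4-byte requirement of vload4 on a float pointer.

```c
/* Kernel source using float4 pointers instead of vload4/vstore4.
   Both buffers must be at least 16-byte aligned for this form. */
static const char *kernel_src =
    "__kernel void my_kernel(__global const float4* src,\n"
    "                        __global float4* dst)\n"
    "{\n"
    "    float4 a = src[0];   /* same element as vload4(0, src) */\n"
    "    /* do something to a */\n"
    "    dst[0] = a;          /* same element as vstore4(a, 0, dst) */\n"
    "}\n";
```

The trade-off is that vload4 and vstore4 tolerate float pointers that are only 4-byte aligned, while the float4* form gives that up in exchange for simpler indexing.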