sycl/dpc++ accessor vs. global_ptr in kernel function object

Question

With the following toy code using Intel OneAPI beta6.

#include <CL/sycl.hpp>
#include <iostream>

namespace sycl = cl::sycl;

const int SIZE=1;

class Increment_accessor {
  public:
    Increment_accessor(sycl::accessor<int, 1, sycl::access::mode::read_write, sycl::access::target::global_buffer> ptr_) : ptr {ptr_} {}
    void operator()(sycl::item<1> item) {
      ptr[item.get_linear_id()]++;
    }
  private:
    sycl::accessor<int, 1, sycl::access::mode::read_write, sycl::access::target::global_buffer> ptr;
};

class Increment_pointer {
  public:
    Increment_pointer(sycl::global_ptr<int> ptr_) : ptr {ptr_} {} 
    void operator()(sycl::item<1> item) {
      ptr[item.get_linear_id()]++;
    }
  private:
    sycl::global_ptr<int> ptr;
};

int 
main(int argc, char *argv[])
{
  sycl::device dev = sycl::default_selector().select_device();
  sycl::queue q(dev);
  int hbuffer[SIZE] = {};

  {
    sycl::buffer<int, 1> hbuf(hbuffer, sycl::range<1> {SIZE});
    q.submit([&](sycl::handler& cgh) {
        auto harray = hbuf.get_access<sycl::access::mode::read_write, sycl::access::target::global_buffer>(cgh);
        // !!! Uncomment _one_ of the following lines to compile !!!
        //Increment_accessor increment {harray};
        //Increment_pointer increment {harray};
        //Increment_pointer increment {harray.get_pointer()};
        cgh.parallel_for<class kernel1>(
            sycl::range<1> {SIZE}, 
            increment
        );
      }
      ); 
  }

  for (int i=0; i<SIZE; i++) std::cout << "hbuffer[" << i << "]= " << hbuffer[i] << std::endl;
}

Question: why are the Increment_pointer versions "wrong"? There's no compile/runtime error. You just don't get the incremented hbuffer at the end. (I've played with some similar versions where the ptr in operator() ends up being 0x0).

I'm still learning to think in "SYCL" so elaborating explanation is welcome.

This code you have provided doesn't compile, increment is not defined and I'm not sure what you are trying to do with it I'm afraid. Have you taken a look at some of the SYCL samples we provide on our GitHub repo? github.com/codeplaysoftware/computecpp-sdk/tree/master/samples — Rod Burns
Sorry I wasn't clear. You need to uncomment one of the lines in the source to get the right function object. I've edited the code to make that more clear. — justapony

illuhad illuhad · Accepted Answer · 2020-05-12T13:46:31

If I understood correctly, you are asking why your code works when using Increment_accessor but breaks when using Increment_pointer. Or, phrased more generally, can a kernel function be built that that accepts pointers as arguments instead of accessors?

The SYCL spec is not super clear on this, but section 4.7.6.3 gives a hint:

A SYCL accessor can be a device accessor in which case it provides access to data within a SYCL kernel function, or a host accessor in which case it provides immediate access on the host. If an accessor has the access target access::target::global_buffer, access::target::constant_buffer, access::target::local,access::target::image or access::target::image_array then it is considered a device accessor, and therefore can only be used within a SYCL kernel function

So, device accessors are not valid on the host. Now, in your pointer-based versions, you are calling get_pointer() (or rely on an implicit conversion from accessor to pointer, which will be equivalent). But at that point, you are still inside the command group scope, not inside the kernel, i.e. the code inside the parallel_for increment kernel. The command group scope is always evaluated on the host in SYCL because the accessors constructed there tell the SYCL runtime how to build the task graph and which dependencies nodes in the SYCL task graph have.

Therefore we can boil down the question to whether calling get_pointer() already qualifies as using a device accessor. If it does, it means that a device accessor is used outside a SYCL kernel which violates the cited section of the specification and would make this code illegal.

I would argue that calling get_pointer() already corresponds to "using an accessor". get_pointer() requires for example that the accessor already has a valid device memory allocation that it can point to. But, in command group scope it is possible that this allocation doesn't even exist yet since an efficient SYCL runtime might only do necessary allocations in device memory lazily in the background right before they are needed. But during command group evaluation, the command group is not even fully submitted yet to the SYCL runtime, as it is, in fact, still in the stage of being submitted. Therefore, a SYCL implementation will in general not be able to guarantee that get_pointer() already works at this stage.

It is therefore best to assume that device accessors on the host are merely a description of the accessed data and only gain more meaning as a mechanism that actually allows data accesses when used inside kernels.

Now, to the more general question whether it is possible to have kernels accept pointers as arguments instead of accessors: It is not possible in SYCL 1.2.1 in my understanding to have accessors and somehow convert them to pointers on the host because of the issues described above. But, if you don't use accessors at all, you can use the Intel unified shared memory (USM) extension. This extension is not yet in regular SYCL 1.2.1, but it is available in Intel oneAPI DPC++. USM allows you to explicitly create device-accessible allocations that are managed using pointers. You can then use those pointers directly in your kernels.

Update: USM has been merged into the official SYCL 2020 standard which was released in Feb 2021, and is now available in multiple SYCL implementations apart from DPC++.

sycl/dpc++ accessor vs. global_ptr in kernel function object

1 Answers