There are several strings like

std::string first, second, third; ...

My plan was to collect their addresses into a char* array:

char *addresses[] = {&first[0], &second[0], &third[0]} ...

and pass the char **addresses to the OpenCL kernel.

There are several problems or questions:

The main issue is that I cannot pass an array of pointers.

Is there any good way to use a large number of strings from the kernel code without copying them, leaving them in shared memory?

I'm using an NVIDIA GPU on Windows, so I can only use OpenCL 1.2.

I cannot concatenate the strings because they come from different structures...

EDIT:

According to the first answer, if I have this (example):

char *p;

cl_mem cmHostString = clCreateBuffer(myDev.getcxGPUContext(), CL_MEM_ALLOC_HOST_PTR, BUFFER_SIZE, NULL, &oclErr);

oclErr = clEnqueueWriteBuffer(myDev.getCqCommandQueue(), cmHostString, CL_TRUE, 0, BUFFER_SIZE, p, 0, NULL, NULL);

Do I need to copy each element of my char array from host memory to another part of host memory (where the new address is hidden from the host)? That does not seem logical to me. Why can't I use the same address? The GPU device could directly access the host memory and use it.

A std::string keeps its contents on the heap and uses internal references, i.e. the internal pointer to the data may very well point to another string instance until you start modifying it. I don't see why you would want to do this. You can pass an array of pointers, but you need to be careful about where they point. - Jens Munk
thx. I cannot pass (__global char **myWords) to the kernel in 1.2. It does not even compile. - user3993078
I don't have a working OpenCL setup at hand, but on several occasions I have used __local float* input[2], see e.g. stackoverflow.com/questions/11978024/…. You can always use a single pointer and re-establish row pointers in the kernel. It gets a little messy if the strings have different lengths, though. - Jens Munk
I'm not sure I understand you. But I think the addresses are different inside the kernel in 1.2. This means I cannot use the [] operator without copying the whole array, can I? - user3993078
If you know the length of all the strings, you can keep the chars in one big one-dimensional array and create row pointers inside the kernel so that you get an array of pointers: char* input[NSTRINGS], input[i] = &largeArray[i*STRING_LENGTH]. It is easier than working with offsets over and over. - Jens Munk

1 Answer

Is there any good way to use a large number of strings from the kernel code without copying them, leaving them in shared memory?

Not in OpenCL 1.2. The Shared Virtual Memory concept has been available since OpenCL 2.0, which NVIDIA does not support yet. You will need to either switch to a GPU that supports OpenCL 2.0 or, for OpenCL 1.2, copy your strings into one contiguous array of characters and pass (copy) that array to the kernel.
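Copying the strings into one contiguous array could look roughly like the host-side sketch below. It is only an illustration: PackedStrings, packStrings and stringAt are hypothetical names that are not part of OpenCL, and the 32-bit unsigned offsets are an assumption chosen to line up with cl_uint on the host and uint in the kernel. Because the strings have different lengths, an offset table is stored next to the characters instead of fixed-size rows.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type (not an OpenCL API): all strings packed back to
// back in one array, plus a table saying where each string starts.
struct PackedStrings {
    std::vector<char> chars;            // characters of all strings, no terminators
    std::vector<unsigned int> offsets;  // offsets[i] = start of string i,
                                        // offsets.back() = total character count
};

// Copies every string into one contiguous buffer and records its start offset.
PackedStrings packStrings(const std::vector<const std::string*>& strings) {
    PackedStrings packed;
    packed.offsets.reserve(strings.size() + 1);
    for (const std::string* s : strings) {
        packed.offsets.push_back(static_cast<unsigned int>(packed.chars.size()));
        packed.chars.insert(packed.chars.end(), s->begin(), s->end());
    }
    // One extra entry so that the length of string i is offsets[i + 1] - offsets[i].
    packed.offsets.push_back(static_cast<unsigned int>(packed.chars.size()));
    return packed;
}

// Host-side mirror of the lookup a kernel would do with the two arrays:
// take a pointer into the big array and a length from the offset table.
std::string stringAt(const PackedStrings& packed, std::size_t i) {
    const char* begin = packed.chars.data() + packed.offsets[i];
    const std::size_t length = packed.offsets[i + 1] - packed.offsets[i];
    return std::string(begin, length);
}
```

With first, second and third from the question, packStrings({&first, &second, &third}) yields two flat arrays. Their data() pointers and sizes are what you would hand to clCreateBuffer or clEnqueueWriteBuffer, and the kernel would then take one __global const char* argument and one __global const uint* argument instead of a char**.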


EDIT: Responding to your edit - you can use:

  • The CL_MEM_ALLOC_HOST_PTR flag to create an empty buffer of the required size, then map that buffer using clEnqueueMapBuffer and fill it through the pointer returned from the mapping. After that, unmap the buffer using clEnqueueUnmapMemObject.
  • The CL_MEM_USE_HOST_PTR flag to create a buffer of the required size, passing it the pointer to your array of characters.

In my experience, a buffer created with the CL_MEM_USE_HOST_PTR flag is usually slightly faster. I think whether the data is actually copied under the hood depends on the implementation. To use it, though, you first need to have your array of characters prepared on the host.

You basically need to benchmark and see which is faster. Also, don't concentrate too much on data copying: with transfer rates in the GB/s range, the copy time is usually tiny compared to how long the kernel takes to run (depending, of course, on what the kernel does).