0
votes

I was trying to bind host memory that was mapped for zero-copy to a texture, but it looks like that isn't possible.

Here is a code sample:

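float* a;    // host pointer to the mapped, pinned allocation
float* d_a;  // device-side pointer to the same memory
cudaSetDeviceFlags(cudaDeviceMapHost);  // enable mapping of pinned host memory
cudaHostAlloc((void **)&a, bytes, cudaHostAllocMapped);
cudaHostGetDevicePointer((void **)&d_a, (void *)a, 0);

// tex is declared at file scope; width and height are in texels, pitch is in bytes.
texture<float, 2, cudaReadModeElementType> tex;
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();
cudaBindTexture2D(0, &tex, d_a, &channelDesc, width, height, pitch);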

Is the recommended approach to use pinned memory and just copy it over to device memory that is bound to the texture?

1
Is the texture defined globally? Also, use cudaHostAllocWriteCombined if you are reading the texture from host memory. - fabrizioM
Yes, it is defined globally. cudaHostAllocWriteCombined just makes the read on the device side more efficient by avoiding the cache. - sjchoi

1 Answer

1
votes

It is possible, but you have to make sure the pitch is correctly aligned, to at least 64-byte granularity. I do not see a pitch alignment requirement in cudaDeviceProp that you can use. cudaDeviceProp::textureAlignment will give you decent guidance: it is the alignment requirement for the base address of the texture, not for the pitch, but I believe it is stricter than the pitch alignment requirement.

Unfortunately there is no cudaHostAllocPitch() to take care of this for you.
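
Since the padding has to be done by hand, here is a minimal host-side sketch of rounding the row size up to an aligned pitch. The helper names alignedPitch and paddedBytes are hypothetical, and the default of 64 bytes is just the granularity mentioned above, not a value queried from the device. If your toolkit's cudaDeviceProp has a texturePitchAlignment field, that value would probably be the better choice.

```cpp
#include <cassert>
#include <cstddef>

// Round a row size in bytes up to the next multiple of `alignment`.
// `alignment` must be a power of two. The default of 64 is an assumption
// taken from the "at least 64B" guidance, not a documented device limit.
inline std::size_t alignedPitch(std::size_t rowBytes, std::size_t alignment = 64)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (rowBytes + alignment - 1) & ~(alignment - 1);
}

// Total number of bytes to request from cudaHostAlloc for `height` padded rows.
inline std::size_t paddedBytes(std::size_t rowBytes, std::size_t height,
                               std::size_t alignment = 64)
{
    return alignedPitch(rowBytes, alignment) * height;
}
```

For example, with width = 100 floats a row is 400 bytes, so the pitch becomes 448 bytes. You would allocate paddedBytes(400, height) bytes with cudaHostAlloc, pass 448 as the pitch argument of cudaBindTexture2D, and address row y on the host as (char *)a + y * 448.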

Fair warning: I have done quite a bit of directed performance testing of 1D texture-from-host-memory, and it is s-l-o-w. Tesla-class hardware goes at 2 G/s and Fermi-class hardware at 0.5 G/s. I have no reason to believe 2D texturing will be any faster.