I am using the CUDA API / cuFFT API. To move data from the host to the GPU I am using the cudaMemcpy functions, as shown below. len is the number of elements in dataReal and dataImag.
void foo(const double* dataReal, const double* dataImag, size_t len)
{
    cufftDoubleComplex* inputData;
    size_t allocSizeInput = sizeof(cufftDoubleComplex) * len;
    cudaError_t allocResult = cudaMalloc((void**)&inputData, allocSizeInput);
    if (allocResult != cudaSuccess) return;
    cudaError_t copyResult;
    // Copy the real parts: one double per row, destination rows one complex element apart.
    copyResult = cudaMemcpy2D(static_cast<void*>(inputData),
                              2 * sizeof(double),
                              static_cast<const void*>(dataReal),
                              sizeof(double),
                              sizeof(double),
                              len,
                              cudaMemcpyHostToDevice);
    if (copyResult != cudaSuccess) { cudaFree(inputData); return; }
    // Copy the imaginary parts, offset by one double. Note the arithmetic on a void*.
    copyResult = cudaMemcpy2D(static_cast<void*>(inputData) + sizeof(double),
                              2 * sizeof(double),
                              static_cast<const void*>(dataImag),
                              sizeof(double),
                              sizeof(double),
                              len,
                              cudaMemcpyHostToDevice);
    //and so on.
}
I am aware that pointer arithmetic on void pointers is not actually allowed. The second cudaMemcpy2D call still works, though: the compiler emits a warning, but the result is correct.
I tried using static_cast< char* >, but that doesn't work because a cufftDoubleComplex* cannot be static_cast to char*.
I am a bit confused why the second cudaMemcpy with the pointer arithmetic on void is working, as I understand it shouldn't. Is the compiler implicitly assuming that the datatype behind void* is one byte long?
Should I change something there? Use a reinterpret_cast< char* >(inputData) for example?
Also, during the allocation I am using the old C-style (void**) cast. I do this because with static_cast I get an "invalid static_cast from cufftDoubleComplex** to void**" error. Is there another way to do this correctly?
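To make the two casting options above concrete, here is a minimal host-only sketch. It assumes a hypothetical stand-in struct, Complex, with the same two-double layout as cufftDoubleComplex (members x and y), so that it compiles without the CUDA headers. The helper names imagSlotViaBytes and imagSlotViaMember are invented for illustration. For background: arithmetic on void* is not standard C++, but GCC accepts it as an extension that treats the pointee size as 1 byte, which would explain the warning and the seemingly correct result.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cufftDoubleComplex (x = real part, y = imaginary part),
// used so the sketch compiles without the CUDA headers.
struct Complex {
    double x;
    double y;
};

// Option A: do the arithmetic on a char* (byte-sized steps are well defined),
// then convert to void* as the last step. static_cast cannot convert
// Complex* to char*, so reinterpret_cast is needed here.
void* imagSlotViaBytes(Complex* data)
{
    assert(data != nullptr);
    char* bytes = reinterpret_cast<char*>(data);
    return bytes + offsetof(Complex, y);
}

// Option B: take the address of the member. No cast and no byte arithmetic
// is required; the double* converts to void* implicitly.
void* imagSlotViaMember(Complex* data)
{
    assert(data != nullptr);
    return &data->y;
}
```

Both helpers should yield the same address, one double past the start of the first element. Option B avoids any dependence on the struct layout being spelled out by hand.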
static_cast<void*>(&(inputData->y)) (instead of + ...) and use sizeof(cufftDoubleComplex) instead of 2 * sizeof(double) (even if it is the same value, the first one is more generic). – Holt
cudaMalloc does not require that you cast to void ** and neither does cudaMemcpy2D require you to cast to void *. – Robert Crovella
... &(double *)) you have computed, to cudaMalloc. Likewise for cudaMemcpy (i.e. double *). Even if you were going to use a cast (again, unnecessary) you should do all your pointer arithmetic first, in whatever type is relevant (e.g. double *), then cast as the final step. This would completely avoid any pointer arithmetic using void *. – Robert Crovella
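The advice in the comments can be sketched on the host without a GPU. The example below is an illustration under stated assumptions, not the CUDA API itself: Complex is a hypothetical stand-in for cufftDoubleComplex, and copy2D is a hypothetical host-only helper that mirrors the argument order of cudaMemcpy2D (minus the cudaMemcpyKind argument) using memcpy. It shows typed member addresses being passed with no casts at the call sites, and sizeof(Complex) used as the destination pitch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for cufftDoubleComplex (x = real part, y = imaginary part).
struct Complex {
    double x;
    double y;
};

// Hypothetical host-only helper mirroring cudaMemcpy2D's argument order:
// copies `height` rows of `width` bytes each, advancing dst by `dpitch` bytes
// and src by `spitch` bytes per row.
void copy2D(void* dst, std::size_t dpitch,
            const void* src, std::size_t spitch,
            std::size_t width, std::size_t height)
{
    assert(width <= dpitch && width <= spitch);
    char* d = static_cast<char*>(dst);             // void* -> char* is a valid static_cast
    const char* s = static_cast<const char*>(src);
    for (std::size_t row = 0; row < height; ++row) {
        std::memcpy(d + row * dpitch, s + row * spitch, width);
    }
}

// Interleave separate real/imaginary arrays into an array of Complex.
// Typed pointers convert to void* implicitly, so no casts appear below.
std::vector<Complex> interleave(const double* dataReal, const double* dataImag, std::size_t len)
{
    std::vector<Complex> out(len);
    if (len == 0) return out;
    copy2D(&out[0].x, sizeof(Complex), dataReal, sizeof(double), sizeof(double), len);
    copy2D(&out[0].y, sizeof(Complex), dataImag, sizeof(double), sizeof(double), len);
    return out;
}
```

Carried over to the code in the question, the same pattern would presumably read cudaMalloc(&inputData, allocSizeInput) for the allocation, and &inputData->x and &inputData->y as the destination arguments of the two cudaMemcpy2D calls, with sizeof(cufftDoubleComplex) as the destination pitch.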