0
votes

I obtain a segmentation fault after the second cudaMalloc.

#include <cuda.h>
#include <cuda_runtime.h>

int main(){

  int n=16;

  float2* a;
  cudaMalloc((void **) a, n*sizeof(float2));
  float2* b;
  cudaMalloc((void **) b, n*sizeof(float2));

  return 0;
}

However, if I comment out either of the two cudaMalloc calls, the code runs fine.

Thanks!

2
Look at "Use of cudamalloc(). Why the double pointer?" if the answer is not enough for you. - bruno
@bruno Yep, one of the answers is the correct one: "This is simply a horrible, horrible API design". - Lundin

2 Answers

6
votes

You have to pass a pointer to the pointer like this:

float2* a;
cudaMalloc(&a, n*sizeof(float2));
float2* b;
cudaMalloc(&b, n*sizeof(float2));

otherwise, you just cast the uninitialized pointer itself to a "pointer to pointer", and the library writes its result through that garbage address, which leads to a segfault.
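
To illustrate the out-parameter pattern described here, the following is a minimal sketch in plain C that runs without a GPU. The names fake_cuda_malloc and allocate_pair are hypothetical and not part of CUDA; fake_cuda_malloc only mimics the shape of cudaMalloc (a void** out-parameter plus a status code) and allocates host memory with malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cudaMalloc: same out-parameter shape, but it
   allocates host memory so the sketch runs without a GPU. */
static int fake_cuda_malloc(void **out, size_t size)
{
    assert(out != NULL);
    void *p = malloc(size);
    if (p == NULL)
        return 1;   /* nonzero plays the role of a CUDA error code */
    *out = p;       /* this write is what crashes when 'out' is garbage */
    return 0;       /* zero plays the role of cudaSuccess */
}

/* Allocates two buffers by passing the ADDRESS of a pointer variable each
   time, so the allocator has a valid place to store its result. */
static int allocate_pair(float **a, float **b, size_t n)
{
    void *tmp_a = NULL;
    void *tmp_b = NULL;

    if (fake_cuda_malloc(&tmp_a, n * sizeof(float)) != 0)
        return 1;
    if (fake_cuda_malloc(&tmp_b, n * sizeof(float)) != 0) {
        free(tmp_a);    /* do not leak the first buffer on failure */
        return 1;
    }
    *a = tmp_a;         /* void* converts to float* implicitly in C */
    *b = tmp_b;
    return 0;
}
```

Passing an uninitialized pointer's value instead of its address would make the `*out = p` line write to an arbitrary location, which is the failure mode in the question.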

-1
votes

Because of the broken CUDA API, the correct answer is to write a wrapper around their trash:

void* saneMalloc (size_t n)
{
  void* tmp;
  if (cudaMalloc(&tmp, n) == cudaSuccess)
    return tmp;
  return NULL;
}

...

float* a = saneMalloc(n * sizeof *a);

You have to do this because the only generic pointer type in C is void*. You can convert from any pointer-to-object type to void*, but that does not apply to void**. So if you have a float*, you cannot pass its address, a float**, to a function expecting a void**. This is an incompatible pointer type.

Specifically, when arguments are passed to a function, they are converted as per the rules of simple assignment (C17 6.5.16.1). Passing a float** to a function expecting a void** is a constraint violation of the simple assignment rule. A conforming compiler must issue a diagnostic, so the code is not allowed to compile cleanly.
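
A small sketch of that conversion rule in isolation, with no allocation involved. The names get_buffer, get_floats and storage are hypothetical and exist only for this illustration; get_buffer stands in for any API with a void** out-parameter.

```c
#include <assert.h>
#include <stddef.h>

static float storage[4];    /* backing memory handed out by the sketch */

/* Hypothetical API with a void** out-parameter, shaped like cudaMalloc. */
static void get_buffer(void **out)
{
    assert(out != NULL);
    *out = storage;
}

/* Conforming route: let the API fill a real void* object, then convert.
 *
 * The non-conforming alternative, shown only as a comment, would be:
 *     float *typed;
 *     get_buffer(&typed);   // float** passed where void** is expected
 */
static float *get_floats(void)
{
    void *generic = NULL;
    get_buffer(&generic);     /* &generic has type void**, an exact match */
    float *typed = generic;   /* void* to float* converts implicitly in C */
    return typed;
}
```

The temporary void* costs nothing at run time and keeps every conversion within what simple assignment allows.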