I'm trying to compile my CUDA C code for a GPU with sm_10 architecture which does not support invoking malloc from __global__ functions.
I need to keep a tree for which the nodes are created dynamically in the GPU memory. Unfortunately, without malloc apparently I can't do that.
Is there a way to copy an entire tree using cudaMalloc? I think that such an approach will just copy the root of my tree.
cudaMalloc? cudaMalloc is used only to allocate memory. Could you also explain why you expect that with cudaMalloc you will only be able to copy the root of your tree? - Vitality
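
One possible way around the "only the root gets copied" concern raised in the question is to avoid storing pointers in the nodes at all. The sketch below is host-side C and makes several assumptions: the node layout (`HostNode`, `FlatNode`) and the helper names (`flatten`, `flat_sum`) are hypothetical and not taken from the question, and the tree is assumed to be binary. It packs the tree into one contiguous array in which children are referenced by index, so the whole structure could be moved in a single transfer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side node that links children with ordinary pointers. */
typedef struct HostNode {
    int value;
    struct HostNode *left;
    struct HostNode *right;
} HostNode;

/* Hypothetical flat node: children are indices into one array.
   An index of -1 means "no child". */
typedef struct FlatNode {
    int value;
    int left;
    int right;
} FlatNode;

/* Copies the subtree rooted at `node` into pool[] in pre-order.
   `count` holds the number of slots already used and is advanced
   for every node written. Returns the index of the copied node,
   or -1 for an empty subtree. */
static int flatten(const HostNode *node, FlatNode *pool,
                   int capacity, int *count)
{
    int self;

    if (node == NULL) {
        return -1;
    }

    /* The pool is assumed to be large enough for the whole tree. */
    assert(*count < capacity);

    self = (*count)++;
    pool[self].value = node->value;
    pool[self].left  = flatten(node->left,  pool, capacity, count);
    pool[self].right = flatten(node->right, pool, capacity, count);
    return self;
}

/* Host-side check: sums all values by following the indices,
   which shows that the flat array still encodes the tree shape. */
static int flat_sum(const FlatNode *pool, int root)
{
    if (root < 0) {
        return 0;
    }
    return pool[root].value
         + flat_sum(pool, pool[root].left)
         + flat_sum(pool, pool[root].right);
}
```

Because the indices are not addresses, they stay valid after the array is moved to the device. Under these assumptions, one cudaMalloc of `count * sizeof(FlatNode)` bytes followed by one cudaMemcpy with cudaMemcpyHostToDevice would transfer the entire tree rather than just the root. To let a kernel add nodes without malloc, the same array could be allocated with spare capacity up front and the kernel could fill unused slots; how threads claim slots without conflicting is left open in this sketch.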