With CUDA, is it possible to do something like garbage collection?
For example, when cudaMalloc(...) returns an out-of-memory error, can I free previously allocated data and then retry the allocation?
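Roughly, this is the pattern I have in mind (a simplified sketch; freeCachedBuffers() is just a placeholder for releasing whatever earlier allocations I could give back):

```cpp
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper: releases previously allocated device buffers
// that are no longer strictly needed.
void freeCachedBuffers();

void* allocateWithRetry(size_t nbytes) {
    void* d_ptr = nullptr;
    cudaError_t err = cudaMalloc(&d_ptr, nbytes);
    if (err == cudaErrorMemoryAllocation) {
        freeCachedBuffers();               // give back earlier allocations
        err = cudaMalloc(&d_ptr, nbytes);  // retry the allocation
    }
    return (err == cudaSuccess) ? d_ptr : nullptr;
}
```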
Once cudaMalloc(...) returns out-of-memory, subsequent CUDA calls seem to keep returning out-of-memory as well.
Even when I call cudaFree on a valid device pointer that was allocated earlier, cudaFree returns out-of-memory...
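In simplified form, the sequence I observe looks roughly like this (the sizes are made up for illustration; the second cudaMalloc is the one that fails):

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    void* d_small = nullptr;
    cudaError_t err = cudaMalloc(&d_small, 256u << 20);   // 256 MB, succeeds

    void* d_huge = nullptr;
    err = cudaMalloc(&d_huge, 1ull << 40);                 // deliberately too large, fails
    printf("second cudaMalloc: %s\n", cudaGetErrorString(err));

    err = cudaFree(d_small);                               // valid pointer, but in my case
    printf("cudaFree: %s\n", cudaGetErrorString(err));     // this also reports out-of-memory
    return 0;
}
```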
cudaDeviceReset() is not a good way to recover the state in my case.