With CUDA, is it possible to do something like garbage collection?
For example, when cudaMalloc(...) returns an out-of-memory error, can I free previously allocated data and then retry the allocation?
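Roughly, this is the pattern I have in mind (a simplified sketch; freeCachedBuffers() is just a placeholder for releasing whatever earlier allocations I could give back):

```cpp
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper: releases previously allocated device buffers
// that are no longer strictly needed.
void freeCachedBuffers();

void* allocateWithRetry(size_t nbytes) {
    void* d_ptr = nullptr;
    cudaError_t err = cudaMalloc(&d_ptr, nbytes);
    if (err == cudaErrorMemoryAllocation) {
        freeCachedBuffers();               // give back earlier allocations
        err = cudaMalloc(&d_ptr, nbytes);  // retry the allocation
    }
    return (err == cudaSuccess) ? d_ptr : nullptr;
}
```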
Once cudaMalloc(...) returns out-of-memory, subsequent CUDA calls seem to keep returning out-of-memory as well.
Even when I call cudaFree on a valid device pointer that was allocated earlier, cudaFree returns out-of-memory...
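In simplified form, the sequence I observe looks roughly like this (the sizes are made up for illustration; the second cudaMalloc is the one that fails):

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    void* d_small = nullptr;
    cudaError_t err = cudaMalloc(&d_small, 256u << 20);   // 256 MB, succeeds

    void* d_huge = nullptr;
    err = cudaMalloc(&d_huge, 1ull << 40);                 // deliberately too large, fails
    printf("second cudaMalloc: %s\n", cudaGetErrorString(err));

    err = cudaFree(d_small);                               // valid pointer, but in my case
    printf("cudaFree: %s\n", cudaGetErrorString(err));     // this also reports out-of-memory
    return 0;
}
```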
cudaDeviceReset() is not a good way to recover the state in my case.