0
votes

I have a simple script as follows:

import torch
LOSS_WEIGHTS = [1,2,3]
LOSS_WEIGHTS = torch.Tensor(LOSS_WEIGHTS)
LOSS_WEIGHTS = LOSS_WEIGHTS.to(0)

If I start the script while the computer is idle, I often get “CUDA error: out of memory”, even though the GPU memory is completely empty.

The error somehow always goes away after I relaunch the script several times. Does anyone know what I can do to prevent this error? Am I supposed to initialise my CUDA device before starting the script?

  • pytorch 1.2.0 (Tried several versions)
  • cuda 10.1 (Also tried cuda 9)
  • python 3.7
  • Nvidia Driver 430
  • Hardware: 1 x GTX 1070
  • Ubuntu 18.04
It sounds like CUDA is still initializing and your script is trying to run a process before that initialization has finished. Maybe use with to check for availability? pytorch.org/docs/stable/notes/cuda.html – eatmeimadanish
Thanks! Adding with torch.cuda.device(0): around a block of indented code seems to get rid of the error completely. This complicates the code a bit, though, since I sometimes want the code to run on CPU on another PC for debugging. – matohak

1 Answer

0
votes

Try initializing the device like this (assuming you have only 1 GPU):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
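Applied to the snippet from the question, a minimal sketch might look like this (same example values as above):

import torch

# Use the GPU when one is visible, otherwise fall back to CPU,
# which also lets the same script run on a CPU-only machine for debugging.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

LOSS_WEIGHTS = torch.tensor([1., 2., 3.], device=device)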

You can also check whether the GPU is available and online with a command like:

nvidia-smi
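From inside Python you can do a similar check, for example with this minimal sketch (it only uses torch.cuda.is_available() and torch.cuda.get_device_name(), which are standard PyTorch calls):

import torch

if torch.cuda.is_available():
    # Report the name of the first visible CUDA device
    print("GPU:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device is currently visible to PyTorch")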