2
votes

I'm just playing around with PyTorch and I'm wondering why it consumes so much of my GPU's memory.

I'm using CUDA 10.0 with PyTorch 1.2.0 and torchvision 0.4.0.

import torch
gpu = torch.device("cuda")
x = torch.ones(int(4e8), device=gpu)
y = torch.ones(int(1e5), device=gpu)

Running the above code, I get this error:

RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 2.00 GiB total capacity; 1.49 GiB already allocated; 0 bytes free; 0 bytes cached)
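For reference, here is the arithmetic behind those numbers (torch.ones allocates float32, i.e. 4 bytes per element):

int(4e8) * 4 / 2**30  # ≈ 1.49 GiB -- matches the "1.49 GiB already allocated"
int(1e5) * 4 / 2**20  # ≈ 0.38 MiB -- presumably rounded up to a 2 MiB block, hence "Tried to allocate 2.00 MiB"

So x alone accounts for the 1.49 GiB, and roughly 0.5 GiB of the 2 GiB card is gone before the second, tiny tensor is even requested.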

So, does PyTorch need ~500 MB of GPU memory as overhead? Or what is the problem here?


1 Answer

2
votes

More information and testing done by xymeng on GitHub can be found in the given link.

Referencing xymeng's words:

PyTorch has its own cuda kernels. From my measurement the cuda runtime allocates ~1GB memory for them. If you compile pytorch with cudnn enabled the total memory usage is 1GB + 750M + others = 2GB+.

Note that this is just my speculation as there is no official documentation about this. What puzzles me is that the cuda runtime allocates much more memory than the actual code size (they are approx. linearly correlated. If I remove half of pytorch's kernels the memory usage is also reduced by half). I suspect either the kernel binaries have been compressed or they have to be post-processed by the runtime.

This seems to fit your situation: the roughly 0.5 GiB you cannot account for is taken up by the CUDA context and PyTorch's kernel images rather than by your tensors.
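A minimal way to check this yourself (a sketch; it assumes nvidia-smi is available on the machine) is to initialize CUDA with a tiny allocation and compare what PyTorch's allocator reports with what the driver reports:

import subprocess
import torch

_ = torch.ones(1, device="cuda")  # forces CUDA context creation and kernel loading

# Memory handed out to tensors / held by the caching allocator: a few KB at most
print("allocated:", torch.cuda.memory_allocated(), "bytes")
print("cached:   ", torch.cuda.memory_cached(), "bytes")

# Per-process usage reported by the driver: typically several hundred MB,
# because it includes the CUDA context and PyTorch's kernel images
print(subprocess.check_output(["nvidia-smi"]).decode())

The gap between the nvidia-smi figure and memory_allocated() is the overhead xymeng describes; it belongs to the CUDA runtime, so it never shows up in PyTorch's own memory counters.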