29
votes

I have hundreds of really large matrices, with shapes like (600, 800) or (3, 600, 800).

Therefore I want to de-allocate the memory used as soon as I don't really need something anymore.

I thought:

some_matrix = None

should do the job. Or does this only set the reference to None while the space stays allocated somewhere in memory (for example, preserved for re-initializing some_matrix in the future)?

Additionally: sometimes I slice through the matrices, calculate something, and put the values into a buffer (a list, because it gets appended to all the time). So setting the list to None will definitely free the memory, right?

Or does some kind of unset() method exist that "deletes" an identifier together with the object it references?

3 Answers

31
votes

You definitely want to have a look at garbage collection. Unlike languages such as C or C++, where the programmer has to free dynamically allocated memory once it is no longer needed, Python manages memory automatically, meaning that Python itself frees the memory when necessary.

When you write some_matrix = None, you unbind the name from the object; the object's reference count is decremented, and once it reaches 0 the memory can be reclaimed. When you use del some_matrix, as suggested by MSeifert, del does not itself free the memory, contrary to what that answer implies. According to the Python documentation, this is what happens:

Deletion of a name removes the binding of that name from the local or global namespace

What happens under the hood is that the object's reference count is reduced by 1, regardless of whether you assign None or use del. Once the count reaches 0, the memory is reclaimed (CPython deallocates the object right away; objects caught in reference cycles wait for the cyclic garbage collector). The only difference is that del removes the name itself, which makes it clear from the context that you no longer need it.
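The equivalence of the two spellings can be checked with a weak reference, which observes an object without keeping it alive. This is a minimal sketch that assumes CPython's reference counting; the Matrix class is a hypothetical stand-in for a large array, not something from the question.

```python
import weakref


class Matrix:
    """Hypothetical stand-in for a large array; supports weak references."""

    def __init__(self, rows, cols):
        self.data = [[0.0] * cols for _ in range(rows)]


def drop_with_none():
    m = Matrix(600, 800)
    probe = weakref.ref(m)    # observes the object without keeping it alive
    m = None                  # rebinding drops the only strong reference
    return probe() is None


def drop_with_del():
    m = Matrix(600, 800)
    probe = weakref.ref(m)
    del m                     # unbinding drops the only strong reference
    return probe() is None


def drop_with_second_reference():
    m = Matrix(600, 800)
    other = m                 # a second strong reference to the same object
    probe = weakref.ref(m)
    del m
    return probe() is None    # `other` still holds the object here


freed_by_none = drop_with_none()
freed_by_del = drop_with_del()
freed_with_alias = drop_with_second_reference()
print(freed_by_none, freed_by_del, freed_with_alias)
```

On CPython the first two results are True and the third is False: neither spelling frees anything while another reference exists.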

If you look at the documentation of the gc module, you will see that you can invoke the garbage collector yourself or change some of its parameters.
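As a rough illustration of invoking the collector by hand, the sketch below builds a reference cycle, which reference counting alone cannot reclaim, and then calls gc.collect(). The Node class is hypothetical, and automatic collection is switched off only to keep the demonstration deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical container that can point at another Node."""

    def __init__(self):
        self.partner = None


gc.disable()                     # keep automatic collection out of the demo
thresholds = gc.get_threshold()  # the tunable parameters (see gc.set_threshold)

a = Node()
b = Node()
a.partner = b
b.partner = a                    # a and b now form a reference cycle
probe = weakref.ref(a)

del a, b                         # the names are gone, the cycle keeps both alive
alive_before_collect = probe() is not None

gc.collect()                     # explicit run of the cyclic collector
alive_after_collect = probe() is not None

gc.enable()
print(alive_before_collect, alive_after_collect)
```

Plain NumPy arrays of numbers do not form cycles by themselves, so for the matrices in the question an explicit gc.collect() is usually only relevant when the arrays are held by objects that reference each other.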

16
votes

NumPy deletes arrays when the reference counter is zero (or at least it keeps track of the reference counter and lets the OS collect the garbage).

For example having

import numpy as np
a = np.linspace(0,100, 10000000)
a = None

will free the memory "immediately" (the preferred way is writing del a, though), while

import numpy as np
a = np.linspace(0,100, 10000000)
b = a
a = None

will free nothing.


You also mentioned slicing. A slice is just a view on the data and therefore behaves exactly like the second example. If you don't delete both variables that reference the same array, its memory stays allocated.
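The view behaviour can be inspected with np.shares_memory and the base attribute. This is a minimal sketch with arbitrary shapes chosen for illustration: the small slice keeps the whole large buffer alive, while an explicit .copy() does not.

```python
import numpy as np

big = np.zeros((3, 600, 800))
view = big[0, :10, :10]                 # basic slicing: a view into big's buffer
independent = big[0, :10, :10].copy()   # owns its own small buffer

view_shares = np.shares_memory(view, big)
copy_shares = np.shares_memory(independent, big)
view_base_is_big = view.base is big

del big                                 # the name is gone ...
kept_shape = view.base.shape            # ... but the view still holds the full array

print(view_shares, copy_shares, view_base_is_big, kept_shape)
```

If only a small part of a large array is needed later, taking a .copy() of the slice and dropping the original lets the large buffer be released.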

If I do something very memory-expensive, I always stick with separate functions that do the operation and return only what is really necessary. Functions clean up after themselves, so any intermediate results are freed (if they are not returned).
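A sketch of that pattern, with a hypothetical function name and made-up arithmetic: the full-size temporaries exist only as locals, and only the small result leaves the function.

```python
import numpy as np


def column_means_of_square(matrix):
    """Hypothetical helper: heavy intermediates stay local to the function."""
    scaled = matrix * 2.0         # full-size temporary
    squared = scaled ** 2         # another full-size temporary
    return squared.mean(axis=0)   # only an (800,) array is returned


matrix = np.ones((600, 800))
means = column_means_of_square(matrix)
print(means.shape)
```

Once the call returns, nothing refers to scaled or squared any more, so their buffers can be released without any explicit del.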

0
votes

If you have to do something like the following, the memory won't be freed, even though a copy of a is made implicitly:

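import numpy as np

a = np.ones((10000, 10000))
b = np.empty((10000, 10000))
b[:] = a   # slice assignment copies a's values into b
a = None   # rebind the name a to None
del a      # remove the name a altogether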

Instead, you can do the following, and the memory will be freed after a = None:

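import numpy as np

a = np.ones((10000, 10000))
b = np.empty((10000, 10000))
b[:] = np.copy(a)   # explicit temporary copy of a, then copied into b
a = None            # rebind the name a to None
del a               # remove the name a altogether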
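Whether b still depends on a's buffer after the slice assignment can be checked rather than assumed. The sketch below uses np.shares_memory on smaller arrays (the sizes are reduced here purely to keep it quick); if it reports False, b holds its own data and does not keep a's buffer alive, with or without the extra np.copy.

```python
import numpy as np

a = np.ones((1000, 1000))
b = np.empty((1000, 1000))
b[:] = a                          # writes a's values into b's existing buffer

shares = np.shares_memory(a, b)   # do the two arrays overlap in memory?
a = None                          # drop the only reference to the first array
total = b.sum()                   # b is still fully usable on its own

print(shares, total)
```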