A GitHub issue from June 2016 (https://github.com/tensorflow/tensorflow/issues/1727) describes the following problem:
currently the Allocator in the GPUDevice belongs to the ProcessState,
which is essentially a global singleton. The first session using GPU
initializes it, and frees itself when the process shuts down.
Thus the only workaround is to run the computation in a separate process and shut that process down when it finishes.
Example Code:
import multiprocessing

import numpy as np
import tensorflow as tf


def run_tensorflow():
    n_input = 10000
    n_classes = 1000

    # Create model
    def multilayer_perceptron(x, weight):
        # Single fully connected layer (a plain matmul, no activation)
        layer_1 = tf.matmul(x, weight)
        return layer_1

    # Store the layer's weights
    weights = tf.Variable(tf.random_normal([n_input, n_classes]))

    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_classes])
    pred = multilayer_perceptron(x, weights)

    cost = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for i in range(100):
            batch_x = np.random.rand(10, n_input)
            batch_y = np.random.rand(10, n_classes)
            sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

    print("finished doing stuff with tensorflow!")


if __name__ == "__main__":
    # option 1: execute the code in an extra process
    p = multiprocessing.Process(target=run_tensorflow)
    p.start()
    p.join()

    # wait until user presses enter key (check nvidia-smi here)
    input()

    # option 2: just execute the function in the current process
    run_tensorflow()

    # wait until user presses enter key (check nvidia-smi here)
    input()
So if you call the function run_tensorflow() within a process you created and then shut that process down (option 1), the memory is freed. If you just run run_tensorflow() in the main process (option 2), the memory is not freed after the function call.
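The same pattern extends to training several models back to back: spawn a fresh process per model so the allocator's memory is returned to the driver between runs. A minimal sketch of that idea, where train_one_model and the metric it returns are hypothetical placeholders, not part of the original answer:

import multiprocessing

def train_one_model(config, queue):
    # build the graph and run the session here, as in run_tensorflow()
    final_cost = 0.0  # hypothetical metric computed by the training run
    queue.put(final_cost)

if __name__ == "__main__":
    for config in [{"lr": 0.001}, {"lr": 0.01}]:
        queue = multiprocessing.Queue()
        p = multiprocessing.Process(target=train_one_model, args=(config, queue))
        p.start()
        result = queue.get()  # read before join() to avoid a full-queue deadlock
        p.join()  # GPU memory is freed when the child process exits
        print("config", config, "->", result)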
Comments on this answer:

Doesn't using with ... sess: (which closes the session automatically) also work? – etarion

I've used with ... sess: and have also tried sess.close(). GPU memory doesn't get cleared, and clearing the default graph and rebuilding it certainly doesn't appear to work. That is, even if I put a 10 sec pause in between models, I don't see memory on the GPU clear with nvidia-smi. That doesn't necessarily mean that tensorflow isn't handling things properly behind the scenes and just keeping its allocation of memory constant. But I'm having trouble validating that line of reasoning. – David Parks

nvidia-smi doesn't correctly report the amount of memory available to TensorFlow. When a TensorFlow computation releases memory, it will still show up as reserved to outside tools, but this memory is available to other computations within tensorflow. – Yaroslav Bulatov
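A related point: if you want nvidia-smi to track TensorFlow's actual usage more closely, you can stop the TF 1.x allocator from reserving most of the GPU up front via the standard allow_growth option. A minimal sketch (the option is real TF 1.x API; the surrounding code is illustrative):

import tensorflow as tf

# Let the allocator start small and grow on demand instead of grabbing
# nearly all GPU memory at session creation. Note that memory, once
# grown, is still not returned to the driver until the process exits.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    pass  # build and run your graph here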