GPU errors when running TensorFlow

Question

I'm following a beginner's TensorFlow tutorial and trying out classification. When I run the script, I get a number of GPU-related warnings, even though I have the CUDA toolkit and the latest GPU drivers installed. Here is the output:

2021-01-13 15:42:24.186914: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-01-13 15:42:24.187065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
[NumericColumn(key='SepalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='SepalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='PetalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='PetalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]
2021-01-13 15:42:26.282013: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-01-13 15:42:26.302224: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties:
pciBusID: 0000:0e:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.86GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2021-01-13 15:42:26.302958: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-01-13 15:42:26.303513: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2021-01-13 15:42:26.304062: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
starting training
2021-01-13 15:42:26.307161: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-01-13 15:42:26.308219: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-01-13 15:42:26.312354: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-01-13 15:42:26.312941: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2021-01-13 15:42:26.313499: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2021-01-13 15:42:26.313623: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-01-13 15:42:26.314323: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-01-13 15:42:26.315481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1300] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-13 15:42:26.315604: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306]
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\levig\AppData\Local\Temp\tmpbmbc3as1
WARNING:tensorflow:From C:\Users\levig\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\training\training_util.py:235: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From C:\Users\levig\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\optimizer_v2\adagrad.py:82: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
2021-01-13 15:42:27.410575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties:
pciBusID: 0000:0e:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.86GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2021-01-13 15:42:27.410786: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-01-13 15:42:27.474456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1300] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-13 15:42:27.474571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0
2021-01-13 15:42:27.474637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1319] 0: N
2021-01-13 15:42:27.482654: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:258] None of the MLIR optimization passes are enabled (registered 0 passes)

Here is my code:

from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf

import pandas as pd
CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
train_path = tf.keras.utils.get_file(
    "iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_path = tf.keras.utils.get_file(
    "iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")

# The CSV files were downloaded above with Keras (a module inside TensorFlow);
# here pandas reads them into DataFrames.
train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
train_y = train.pop('Species')
test_y = test.pop('Species')
train.head() # the species column is now gone


def input_fn(features, labels, training=True, batch_size=256):
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle and repeat if you are in training mode.
    if training:
        dataset = dataset.shuffle(1000).repeat()

    return dataset.batch(batch_size)


# Feature columns describe how to use the input.
my_feature_columns = []
for key in train.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))

# Build a DNN with 2 hidden layers of 30 and 10 nodes, respectively.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    # Two hidden layers of 30 and 10 nodes respectively.
    hidden_units=[30, 10],
    # The model must choose between 3 classes.
    n_classes=3)

print("starting training")

classifier.train(
    input_fn=lambda: input_fn(train, train_y, training=True),
    steps=5000)

As the log says: "Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices." In particular: "Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found". Is that DLL on your PATH? Is it provided with your TF library? You may have to install CUDA and create symlinks to cudart64_110.dll. — Soleil
Please don't add answers to your questions. I have rolled back/edited your question. Please write a new answer instead. — Sabito 錆兎

Unknown Unknown · Accepted Answer · 2021-06-15T06:27:25

From comments

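TensorFlow found the GTX 1080 but could not load cudart64_110.dll, cublas64_11.dll, cublasLt64_11.dll, cusparse64_11.dll, or cudnn64_8.dll, so it skipped registering the GPU and ran on the CPU instead. The file names indicate that this TensorFlow build expects CUDA 11.0 and cuDNN 8. Make sure those libraries are installed properly and can be found on your PATH if you would like to use the GPU. Follow the GPU support guide (https://www.tensorflow.org/install/gpu) for how to download and set up the required libraries for your platform. (paraphrased from Soleil)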
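To see which of the libraries from the warnings are actually reachable, a quick check along these lines may help. This is only a sketch: the DLL names are copied from the log in the question, `find_missing_dlls` is a hypothetical helper (not part of TensorFlow), and it assumes the DLLs are located through the directories listed in the PATH environment variable.

```python
import os
from pathlib import Path

# DLL names copied from the "Could not load dynamic library" warnings in the log.
REQUIRED_DLLS = [
    "cudart64_110.dll",
    "cublas64_11.dll",
    "cublasLt64_11.dll",
    "cusparse64_11.dll",
    "cudnn64_8.dll",
]


def find_missing_dlls(dll_names, search_dirs=None):
    """Return the names in dll_names that exist in none of the search directories.

    Hypothetical helper: it only checks for files on disk and does not try to
    load them, so a version mismatch inside a DLL would not be detected.
    """
    if search_dirs is None:
        # Assumption: the libraries are resolved via the PATH environment variable.
        search_dirs = os.environ.get("PATH", "").split(os.pathsep)
    dirs = [Path(d) for d in search_dirs if d]
    return [
        name for name in dll_names
        if not any((d / name).is_file() for d in dirs)
    ]


if __name__ == "__main__":
    missing = find_missing_dlls(REQUIRED_DLLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed CUDA/cuDNN DLLs were found on PATH.")
```

If anything is reported as missing, installing the matching CUDA and cuDNN versions (or adding their bin directories to PATH) and restarting the terminal is the usual next step. Once the warnings are gone, `tf.config.list_physical_devices('GPU')` should return a non-empty list.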
