0 votes

Following an answer on SO, I ran:

# confirm TensorFlow sees the GPU
from tensorflow.python.client import device_lib
assert 'GPU' in str(device_lib.list_local_devices())

# confirm Keras sees the GPU
from keras import backend
assert len(backend.tensorflow_backend._get_available_gpus()) > 0

# confirm PyTorch sees the GPU
from torch import cuda
assert cuda.is_available()
assert cuda.device_count() > 0
print(cuda.get_device_name(cuda.current_device()))

The first test passes, but the Keras and PyTorch checks both fail.
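For completeness, here is a minimal sketch of a further check worth running: compare the CUDA version each framework was built against with the toolkit reported by nvcc (torch.version.cuda and tf.test.is_gpu_available() are standard PyTorch / TF 1.x APIs):

# Compare the CUDA version each framework was compiled against
# with the toolkit reported by nvcc (9.0 here).
import torch
import tensorflow as tf

print("PyTorch built with CUDA:", torch.version.cuda)
print("TensorFlow version:", tf.__version__)
print("TF sees a GPU:", tf.test.is_gpu_available())  # TF 1.x API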

Running nvcc --version gives:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

nvidia-smi also works.

list_local_devices() provides:

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 459307207819325532
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 9054555249843627113
physical_device_desc: "device: XLA_GPU device"
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 5902450771458744885
physical_device_desc: "device: XLA_CPU device"
]

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) returns:

Device mapping:
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
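For reference, a minimal TF 1.x sketch that pins a small op to the XLA_GPU device from the listing above and lets log_device_placement show where it actually runs (if the GPU is unusable, this should fail or fall back to CPU, which is itself informative):

import tensorflow as tf

# Pin a trivial op to the XLA GPU device reported by list_local_devices().
with tf.device('/device:XLA_GPU:0'):
    a = tf.constant([1.0, 2.0])
    b = tf.constant([3.0, 4.0])
    c = a + b

# log_device_placement prints the device each op is assigned to.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))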

Why are Keras and PyTorch unable to run on my GPU? (RTX 2070)

What keras version is this? – Paritosh Singh
Actually, it does not work with TF either; tf.test.is_gpu_available() returns False. – guhur
@ParitoshSingh keras is 2.2.4. – guhur
Oh OK, if it doesn't work with TensorFlow either, then you need to install TensorFlow for GPU. It involves more steps than just a pip install. – Paritosh Singh
What do you mean? I installed tensorflow-gpu with pip. – guhur

2 Answers

0 votes

I had a hard time finding the issue. Running the CUDA samples gave me the crucial insight:

CUDA error at ../../common/inc/helper_cuda.h:1162 code=30(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)"

While with sudo it printed:

MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2070" with compute capability 7.5

So the issue was that my CUDA libraries were not readable by everyone.

My bug was fixed with:

sudo chmod -R a+r /usr/local/cuda*
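You can verify the fix without rerunning the samples (a minimal sketch, assuming the default /usr/local/cuda install path):

import glob
import os

# Check that the CUDA runtime libraries are readable by the current
# (non-root) user; adjust the glob if CUDA lives somewhere else.
for lib in glob.glob('/usr/local/cuda/lib64/libcudart*'):
    ok = os.access(lib, os.R_OK)
    print(lib, '-> readable' if ok else '-> NOT readable')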

0 votes

I ran into this problem recently. It turned out that the pip-installed requisite packages (such as keras) weren't built with the XLA-related flags. When I switched to a complete miniconda or anaconda install of the requisite packages, I was able to run my code. In my case I was running the Facebook AI code.

An early indicator that there is a problem is running:

nvidia-smi

and seeing that your deep net isn't using gigabytes of GPU memory, but only a trivial amount. Even without the warnings (which can sometimes be hard to find in the logs), you then know that the problem lies in how the requisite software was compiled: the GPU never gets a match on device type, so execution defaults to the CPU and the code is offloaded onto it.
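The same check can be scripted (a sketch assuming the pynvml package, which wraps the NVML API that nvidia-smi itself uses):

import pynvml

# Query GPU memory usage through NVML; a near-idle number while a
# "GPU" training job runs means the work landed on the CPU.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
info = pynvml.nvmlDeviceGetMemoryInfo(handle)
print("used:  %d MiB" % (info.used // 2**20))
print("total: %d MiB" % (info.total // 2**20))
pynvml.nvmlShutdown()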

In my case, I installed tensorflow-gpu, ipython, imutils, imgaug and a few other packages using miniconda. If you find that a requisite package is missing from conda, use:

conda install -c conda-forge <package-name>

to pick up missing items such as imutils and imgaug.