
I am installing the latest TensorFlow library on my Ubuntu 16.04 machine. For this I downloaded and installed the latest CUDA toolkit and cuDNN libraries.

After installation I checked it with the following commands:

(/home/naseer/anaconda2/) naseer@naseer-Virtual-Machine:~/anaconda2$ python
Python 2.7.13 |Anaconda 4.3.1 (64-bit)| (default, Dec 20 2016, 23:09:15) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:102] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH: /usr/local/cuda-8.0.61/lib64
I tensorflow/stream_executor/cuda/cuda_dnn.cc:2259] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally

What does the above output mean? Does it mean that TensorFlow will run correctly on my NVIDIA GPU-enabled system, or do I need to do something else?

My local Directory Structure:

I have added the following screenshot, which shows the various library paths in my local directories.

[screenshot of local library paths]

My Understanding

I have a feeling that it is trying to open the CUDA library at the path /usr/local/cuda-8.0.61/lib64, when in fact the existing paths are /usr/local/cuda-8.0/lib64 and /usr/local/cuda/lib64. I tried to rename that path, but it still did not work.
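To see which of these CUDA directories actually contain cuDNN, a small diagnostic like the following can help (a sketch using only the Python standard library; on a machine without CUDA it simply prints an empty mapping):

```python
import glob
import os

def cuda_dirs_with_cudnn():
    """Map each /usr/local/cuda* directory to the libcudnn* files
    (if any) found in its lib64 subdirectory."""
    return {d: glob.glob(os.path.join(d, "lib64", "libcudnn*"))
            for d in glob.glob("/usr/local/cuda*")}

print(cuda_dirs_with_cudnn())
```

A directory that maps to an empty list has no cuDNN in it, which would explain the "Couldn't open CUDA library libcudnn.so" message.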

Updates (Conflicting Directory Structure)

[screenshot showing the conflicting CUDA directory structure]

I have installed cuDNN and set its path as described in the link you suggested, but it did not work. – Naseer
@hbaderts I have added a more specific description that addresses my particular issue. I hope it also helps others. – Naseer
What is the output of echo $LD_LIBRARY_PATH? – hbaderts
Sorry @hbaderts, my lab is closed now. I will tell you tomorrow. Thanks. – Naseer

2 Answers

1 vote

To run TensorFlow, you have to install cuDNN. There are two possible ways:

1. Installing cuDNN for all Users:

This is the way that the official TensorFlow documentation describes. Here, cuDNN is installed into the folder /usr/local/cuda. That way, cuDNN can be used by all users on that machine. The instructions are taken from the TensorFlow documentation:

  1. Download the correct cuDNN version. For TensorFlow r1.1, that would be cuDNN v5.1 for CUDA 8.0.
  2. Unpack the .tgz file. Open a terminal, navigate to the folder where you downloaded cuDNN, and call

    tar xvzf cudnn-8.0-linux-x64-v5.1-ga.tgz
    

    Note: this is just an example, check the file name before calling this.

    This will create a new folder called cuda, which contains two subfolders include and lib64, containing all cuDNN files.

  3. Copy the extracted files to /usr/local/cuda. You will need sudo rights for this!

    sudo cp cuda/include/cudnn.h /usr/local/cuda/include
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
    

And that's already it. TensorFlow should now work as expected.
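One quick way to confirm the copy worked is to ask the dynamic loader whether it can now resolve libcudnn (a sketch using Python's standard library; on a machine without cuDNN it prints False):

```python
# find_library asks the system loader to resolve the library name;
# it returns the soname string on success and None otherwise.
from ctypes.util import find_library

def cudnn_visible():
    """True if the dynamic loader can resolve libcudnn."""
    return find_library("cudnn") is not None

print(cudnn_visible())
```

If this prints False after installing, running `sudo ldconfig` may be needed so the loader cache picks up the new files in /usr/local/cuda/lib64.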

2. Installing cuDNN locally:

If you do not have admin rights, or you want to have different cuDNN versions on your machine, you can install cuDNN to any folder of your choice, and then set the paths correctly. This method is proposed in this answer on StackOverflow and is explained in the official NVIDIA installation instructions.

Steps 1 and 2 are the same as above.

  3. Move the extracted cuda folder to the place of your choice.
  4. Add this directory to the $LD_LIBRARY_PATH environment variable. In a terminal, you can do this by calling

    export LD_LIBRARY_PATH=/path/to/cudnn/lib64:$LD_LIBRARY_PATH
    

    where /path/to/cudnn is the place where you moved cuDNN in the previous step. Note the lib64 at the end!

    Usually, you'll have to call this every time before starting TensorFlow. To avoid this, you can edit the file ~/.bashrc and add this line at the bottom of the file. This will automatically add cuDNN to the path every time you start a terminal window.

With that, TensorFlow will be able to find cuDNN and work as expected.
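To double-check that the export took effect, a small script can scan every directory listed in LD_LIBRARY_PATH for libcudnn files (a sketch; an empty list means the loader will not find cuDNN on that path):

```python
import glob
import os

def find_cudnn_in_ld_path():
    """Return every libcudnn* file found in the directories listed in
    LD_LIBRARY_PATH (empty list if the variable is unset or nothing
    matches)."""
    hits = []
    for d in os.environ.get("LD_LIBRARY_PATH", "").split(":"):
        if d:
            hits.extend(glob.glob(os.path.join(d, "libcudnn*")))
    return hits

print(find_cudnn_in_ld_path())
```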

1 vote

To run a GPU-enabled TensorFlow 1.4 you should first install CUDA 8 (+ patch 2) and cuDNN v6.0; you may find this step-by-step installation guide useful.

After installing the CUDA 8 drivers you will need to install cuDNN v6.0:

Download the cuDNN v6.0 library. It can be downloaded from here; please note that you will need to register first.

Copy the archive to the remote machine (scp -r -i ...)

Extract the files from the .tgz archive and copy them to the CUDA directory:

tar xvzf cudnn-8.0-linux-x64-v6.0.tgz

sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include

sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64

sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
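To verify which cuDNN version was actually copied, the version macros in cudnn.h can be parsed (a sketch; cuDNN v6 keeps these macros in cudnn.h, while later versions moved them to cudnn_version.h; it returns None when the header is missing or unreadable):

```python
import re

def cudnn_version(header="/usr/local/cuda/include/cudnn.h"):
    """Parse (major, minor, patchlevel) from the cuDNN header,
    or return None if the header cannot be read or parsed."""
    try:
        with open(header) as f:
            text = f.read()
    except OSError:
        return None
    matches = [re.search(r"#define CUDNN_%s\s+(\d+)" % name, text)
               for name in ("MAJOR", "MINOR", "PATCHLEVEL")]
    if all(matches):
        return tuple(int(m.group(1)) for m in matches)
    return None

print(cudnn_version())
```

For the setup described here, a result of (6, 0, x) would confirm cuDNN v6.0 is in place.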

Update your bash file

nano ~/.bashrc

Add the following lines to the end of the bash file:

export CUDA_HOME=/usr/local/cuda

export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH

export PATH=${CUDA_HOME}/bin:${PATH}

Install the libcupti-dev library

sudo apt-get install libcupti-dev

Install pip

sudo apt-get install python-pip

sudo pip install --upgrade pip

Install TensorFlow

sudo pip install tensorflow-gpu

Test the installation, by running the following within the Python command line:

from tensorflow.python.client import device_lib

def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

get_available_gpus()

For a single GPU the output should be similar to:

2017-11-22 03:18:15.187419: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA

2017-11-22 03:18:17.986516: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2017-11-22 03:18:17.986867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:

name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235

pciBusID: 0000:00:1e.0

totalMemory: 11.17GiB freeMemory: 11.10GiB

2017-11-22 03:18:17.986896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)

[u'/device:GPU:0']