I am trying to train my models on multiple GPUs, so I ran cifar10_multi_gpu_train.py (https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_multi_gpu_train.py).
1. My environment:
OS Platform : Linux version 3.10.0-327.el7.x86_64
TensorFlow installed : pip install --upgrade ./tensorflow_gpu-1.0.0rc0-cp35-cp35m-linux_x86_64.whl
Python version: Python 3.5.2
CUDA/cuDNN version: cuda_8.0.61_375.26_linux.run / cudnn-8.0-linux-x64-v5.1.tgz
2. The GPU setup appears to be correct. This placement test runs without errors:
import tensorflow as tf
with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0], shape=[3], name='a')
    b = tf.constant([1.0, 2.0, 3.0], shape=[3], name='b')
with tf.device('/gpu:1'):
    c = a + b
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
sess.run(c)
add: (Add): /job:localhost/replica:0/task:0/gpu:1
I tensorflow/core/common_runtime/simple_placer.cc:841] add: (Add)/job:localhost/replica:0/task:0/gpu:1
b: (Const): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] a: (Const)/job:localhost/replica:0/task:0/cpu:0
array([ 2., 4., 6.], dtype=float32)
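To double-check the placement log above without reading it by eye, here is a rough sketch in plain Python. The function name parse_placements is something I made up, and the regex only assumes the "name: (Op): device" line format shown in my output; it is not part of TensorFlow.

```python
import re

# Matches placement lines such as
#   "add: (Add): /job:localhost/replica:0/task:0/gpu:1"
# The "I tensorflow/...simple_placer.cc" lines lack the "): " separator
# and are therefore skipped.
PLACEMENT_RE = re.compile(r"([\w/]+): \((\w+)\): (/job:\S+)")


def parse_placements(log_text):
    """Return {node_name: device} for every placement line in log_text."""
    return {name: device for name, _op, device in PLACEMENT_RE.findall(log_text)}


# Sample copied from the log_device_placement output in this post.
log = """\
add: (Add): /job:localhost/replica:0/task:0/gpu:1
I tensorflow/core/common_runtime/simple_placer.cc:841] add: (Add)/job:localhost/replica:0/task:0/gpu:1
b: (Const): /job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
"""

placements = parse_placements(log)
for node, device in sorted(placements.items()):
    print(node, "->", device)
```

With my log this reports the add op on gpu:1 and both constants on cpu:0, which is the placement I asked for.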
3. Running python cifar10_multi_gpu_train.py fails with InvalidArgumentError:
I tensorflow/core/common_runtime/simple_placer.cc:669] Ignoring device specification /GPU:0 for node 'tower_0/fifo_queue_Dequeue' because the input edge from 'prefetch_queue/fifo_queue' is a reference connection and already has a device field set to /CPU:0
Traceback (most recent call last):
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1000, in _run_fn
    self._extend_graph()
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1049, in _extend_graph
    self._session, graph_def.SerializeToString(), status)
  File "/home/xx/anaconda3/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device to node 'tower_0/softmax_linear/weight_loss_1': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
[[Node: tower_0/softmax_linear/weight_loss_1 = ScalarSummary[T=DT_FLOAT, _device="/device:GPU:0"](tower_0/softmax_linear/weight_loss_1/tags, tower_0/softmax_linear/weight_loss)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "cifar10_multi_gpu_train.py", line 280, in <module>
    tf.app.run()
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "cifar10_multi_gpu_train.py", line 276, in main
    train()
  File "cifar10_multi_gpu_train.py", line 237, in train
    sess.run(init)
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/home/xx/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device to node 'tower_0/softmax_linear/weight_loss_1': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
[[Node: tower_0/softmax_linear/weight_loss_1 = ScalarSummary[T=DT_FLOAT, _device="/device:GPU:0"](tower_0/softmax_linear/weight_loss_1/tags, tower_0/softmax_linear/weight_loss)]]
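To make the error easier to read, here is a small plain-Python sketch that pulls the node name, op type, and requested device out of the message. The helper name summarize_placement_error and both regexes are my own assumptions based only on the message text above; they are not a TensorFlow API.

```python
import re

# Error text copied from the traceback in this post.
ERROR_MESSAGE = (
    "Cannot assign a device to node 'tower_0/softmax_linear/weight_loss_1': "
    "Could not satisfy explicit device specification '/device:GPU:0' because "
    "no supported kernel for GPU devices is available.\n"
    "[[Node: tower_0/softmax_linear/weight_loss_1 = ScalarSummary[T=DT_FLOAT, "
    '_device="/device:GPU:0"](tower_0/softmax_linear/weight_loss_1/tags, '
    "tower_0/softmax_linear/weight_loss)]]"
)

# Assumed layout: "[[Node: <name> = <OpType>[attrs](inputs)]]"
NODE_RE = re.compile(r"\[\[Node: (?P<node>\S+) = (?P<op>\w+)\[")
DEVICE_RE = re.compile(r"explicit device specification '(?P<device>[^']+)'")


def summarize_placement_error(message):
    """Return the node, op type, and requested device, or None if not found."""
    node = NODE_RE.search(message)
    device = DEVICE_RE.search(message)
    if node is None or device is None:
        return None
    return {
        "node": node.group("node"),
        "op": node.group("op"),
        "device": device.group("device"),
    }


summary = summarize_placement_error(ERROR_MESSAGE)
print(summary)
```

As far as I can tell from this, the op being rejected is a ScalarSummary that was explicitly pinned to /device:GPU:0, and the message says there is no GPU kernel for it.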
I have tried many solutions, but none of them worked. Thanks in advance for any advice.