I am trying to convert a model to run with eager execution. However, I have run into an odd issue: even when tf.linspace and all of its arguments are placed on the GPU, it still appears to trigger copies to/from CPU memory. Consider this minimal example:
import tensorflow as tf

tfe = tf.contrib.eager
tfe.enable_eager_execution(config=tf.ConfigProto(allow_soft_placement=True,
                                                 log_device_placement=True),
                           device_policy=tfe.DEVICE_PLACEMENT_WARN)

a = tf.constant(7.).gpu()
b = tf.constant(8.).gpu()
c = tf.constant(4).gpu()

with tf.device("/device:GPU:0"):
    print(a)
    print(tf.linspace(a, b, c))
This gives the following warnings:
2018-05-22 23:29:56.401000: W tensorflow/c/eager/c_api.cc:506] before computing LinSpace input #0 was expected to be on /job:localhost/replica:0/task:0/device:CPU:0 but is actually on /job:localhost/replica:0/task:0/device:GPU:0 (operation running on /job:localhost/replica:0/task:0/device:GPU:0). This triggers a copy which can be a performance bottleneck.
2018-05-22 23:29:56.401275: W tensorflow/c/eager/c_api.cc:506] before computing LinSpace input #1 was expected to be on /job:localhost/replica:0/task:0/device:CPU:0 but is actually on /job:localhost/replica:0/task:0/device:GPU:0 (operation running on /job:localhost/replica:0/task:0/device:GPU:0). This triggers a copy which can be a performance bottleneck.
2018-05-22 23:29:56.401534: W tensorflow/c/eager/c_api.cc:506] before computing LinSpace input #2 was expected to be on /job:localhost/replica:0/task:0/device:CPU:0 but is actually on /job:localhost/replica:0/task:0/device:GPU:0 (operation running on /job:localhost/replica:0/task:0/device:GPU:0). This triggers a copy which can be a performance bottleneck.
Note that the warnings say the inputs were *expected* on CPU even though the op itself runs on the GPU, which suggests the LinSpace kernel wants its scalar start/stop/num inputs in host memory. I am running TensorFlow 1.8 on Python 2.7.
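For reference, here is a sketch of the workaround I am currently using. It assumes (I have not confirmed this in the kernel registrations) that the GPU LinSpace kernel pins its scalar inputs to host memory, so leaving the three scalars on the CPU should avoid the copies while the output can still land on the GPU:

import tensorflow as tf

# Assumption: LinSpace expects start/stop/num in host memory, so we
# deliberately do NOT call .gpu() on these scalars.
a = tf.constant(7.)   # stays on CPU
b = tf.constant(8.)   # stays on CPU
c = tf.constant(4)    # stays on CPU

# With eager execution enabled (as in the snippet above), this runs
# without the device-copy warnings on my machine.
result = tf.linspace(a, b, c)
print(result)  # 4 evenly spaced values from 7.0 to 8.0

This sidesteps the warnings but feels wrong, since it means I cannot keep all of my tensors on the GPU.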