I am trying to deploy a TensorFlow model to GCP's Cloud Machine Learning Engine for prediction, but I get the following error:
$> gcloud ml-engine versions create v1 --model $MODEL_NAME --origin $MODEL_BINARIES --runtime-version 1.9
Creating version (this might take a few minutes)......failed.
ERROR: (gcloud.ml-engine.versions.create) Bad model detected with error: "Failed to load model: Loading servable: {name: default version: 1} failed: Invalid argument: Cannot assign a device for operation 'tartarus/dense_2/bias': Operation was explicitly assigned to /device:GPU:3 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.\n\t [[Node: tartarus/dense_2/bias = VariableV2[_class=[\"loc:@tartarus/dense_2/bias\"], _output_shapes=[[200]], container=\"\", dtype=DT_FLOAT, shape=[200], shared_name=\"\", _device=\"/device:GPU:3\"]()]]\n\n (Error code: 0)"
My model was trained on several GPUs, and it seems like the default machines on CMLE don't support GPU for prediction, hence the error I get. So, I am wondering if the following is possible:
- Set the allow_soft_placement variable to True, so that CMLE can use the CPU instead of the GPU for a given model.
- Activate GPU prediction on CMLE for a given model.
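For context, allow_soft_placement is a flag on the session configuration, and I don't know whether CMLE exposes it at serving time; locally it would look something like this (TF 1.x-style API, reachable via tf.compat.v1 on newer installs):

```python
import tensorflow.compat.v1 as tf  # plain "import tensorflow as tf" on runtime 1.9

# allow_soft_placement lets TF fall back to an available device (e.g. CPU)
# when an op's explicitly assigned device does not exist on the machine.
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)
sess.close()
```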
If not, how can I deploy a TF model trained on GPUs to CMLE for prediction? It feels like this should be a straightforward feature to use, but I can't find any documentation about it.
Thanks!
Comments:
– Iñigo: It looks like the model was saved with an explicit tf.device("/gpu:3") assignment, as per the documentation.
– Oriol Nieto: Exactly, I used tf.device("/gpu:x") when saving the model. If I export the model without the explicit device placement, everything works. I am wondering if there's a way to deploy such a model for prediction without having to re-export it.
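One way to avoid retraining before re-exporting (a sketch, not CMLE-specific) is to blank the device field on every node of the GraphDef before saving; tf.train.import_meta_graph(..., clear_devices=True) does the equivalent when re-importing from a checkpoint. The graph below only mimics the failing model's explicit GPU pin:

```python
import tensorflow.compat.v1 as tf  # TF 1.x-style API

# Build a tiny graph with an explicit GPU pin, mimicking the failing model.
graph = tf.Graph()
with graph.as_default():
    with tf.device("/device:GPU:3"):
        bias = tf.Variable(tf.zeros([200]), name="dense_2/bias")

graph_def = graph.as_graph_def()

# Strip every explicit device assignment so the graph can load on
# CPU-only serving machines.
for node in graph_def.node:
    node.device = ""
```

After stripping, the cleaned GraphDef can be re-imported and exported as a SavedModel that no longer demands /device:GPU:3 at load time.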