
I am currently following this tutorial to retrain Inception for image classification: https://cloud.google.com/blog/big-data/2016/12/how-to-train-and-classify-images-using-google-cloud-machine-learning-and-cloud-dataflow

However, when I make a prediction with the API, I only get the index of my class back as the prediction. I would like the API to return the actual class name as a string instead. So instead of

predictions:
- key: '0'
  prediction: 4
  scores:
  - 8.11998e-09
  - 2.64907e-08
  - 1.10307e-06

I would like to get:

predictions:
- key: '0'
  prediction: ROSES
  scores:
  - 8.11998e-09
  - 2.64907e-08
  - 1.10307e-06

Looking at the reference for the Google API, this should be possible: https://cloud.google.com/ml-engine/reference/rest/v1/projects/predict

I have already tried changing the following in model.py

outputs = {
    'key': keys.name,
    'prediction': tensors.predictions[0].name,
    'scores': tensors.predictions[1].name
}
tf.add_to_collection('outputs', json.dumps(outputs))

to

if tensors.predictions[0].name == 0:
    pred_name = 'roses'
elif tensors.predictions[0].name == 1:
    pred_name = 'tulips'


outputs = {
    'key': keys.name,
    'prediction': pred_name,
    'scores': tensors.predictions[1].name
}
tf.add_to_collection('outputs', json.dumps(outputs))

but this doesn't work.

My next idea was to change this part in the preprocess.py file, so that instead of getting the index I use the string label:

  def process(self, row, all_labels):
    try:
      row = row.element
    except AttributeError:
      pass
    if not self.label_to_id_map:
      for i, label in enumerate(all_labels):
        label = label.strip()
        if label:
          self.label_to_id_map[label] = label #i

and

label_ids = []
for label in row[1:]:
  try:
    label_ids.append(label.strip())
    #label_ids.append(self.label_to_id_map[label.strip()])
  except KeyError:
    unknown_label.inc()

but this gives the error:

TypeError: 'roses' has type <type 'str'>, but expected one of: (<type 'int'>, <type 'long'>) [while running 'Embed and make TFExample']

Hence I thought I should change something here in preprocess.py in order to allow strings:

    example = tf.train.Example(features=tf.train.Features(feature={
        'image_uri': _bytes_feature([uri]),
        'embedding': _float_feature(embedding.ravel().tolist()),
    }))

if label_ids:
  label_ids.sort()
  example.features.feature['label'].int64_list.value.extend(label_ids)

But I don't know how to change it appropriately, as I could not find something like str_list. Could anyone please help me out here?


1 Answer


Online prediction certainly allows this, but the model itself needs to be updated to do the conversion from int to string.

Keep in mind that the Python code is just building a graph which describes what computation to do in your model -- you're not sending the Python code to online prediction, you're sending the graph you build.

That distinction is important because the changes you have made are in Python -- you don't yet have any inputs or predictions, so you won't be able to inspect their values. What you need to do instead is add the equivalent lookups to the graph that you're exporting.
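To see why the `if tensors.predictions[0].name == 0:` check can never fire, note that `.name` on a Tensor is its string identifier in the graph, not a computed value. A tiny sketch (TF 2 graph mode; the auto-generated name is illustrative):

```python
import tensorflow as tf

# Build a tiny throwaway graph, just to inspect what the Python
# objects are before anything has actually run.
g = tf.Graph()
with g.as_default():
    scores = tf.constant([[0.1, 0.9]])
    pred = tf.argmax(scores, 1)

# `pred.name` is the tensor's string identifier in the graph
# (something like 'ArgMax:0'), not a predicted class index,
# so comparing it to 0 or 1 is always False.
print(pred.name)
print(pred.name == 0)  # False
```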

You could modify the code like so:

labels = tf.constant(['cars', 'trucks', 'suvs'])
predicted_indices = tf.argmax(softmax, 1)
prediction = tf.gather(labels, predicted_indices)

And leave the inputs/outputs untouched from the original code.
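Putting it together as a runnable sketch: the flower labels and softmax scores below are made up for illustration (use the label ordering your preprocessing actually assigned), and it runs eagerly under TF 2; in the tutorial's graph code you would add the same argmax/gather ops before exporting.

```python
import tensorflow as tf

# Hypothetical class labels, in the same order the preprocessing
# assigned integer ids (index 0 -> 'daisy', 1 -> 'dandelion', ...).
labels = tf.constant(['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips'])

# A made-up softmax output for a batch of two images.
softmax = tf.constant([[0.01, 0.02, 0.90, 0.03, 0.04],
                       [0.10, 0.70, 0.05, 0.05, 0.10]])

# argmax gives the integer class index per row...
predicted_indices = tf.argmax(softmax, 1)

# ...and gather maps each index to its string label. Because both are
# graph ops, the lookup also happens at serving time, so the exported
# model returns strings like ROSES instead of integer indices.
prediction = tf.gather(labels, predicted_indices)

print(prediction.numpy())  # [b'roses' b'dandelion']
```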