Store Tensorflow object detection API image output with boxes in CSV format

Question

I am using Google's TensorFlow Object Detection API. I have successfully trained and tested a model. After testing, I get an output image with a box drawn around each detected object. How do I get the coordinates of these boxes in CSV form? The testing code can be found at https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb

If I look at the helper code, it loads the image into a NumPy array:

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

The detection step takes this image array and produces output with the boxes drawn, as follows:

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    # Define input and output Tensors for detection_graph
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    # Each box represents a part of the image where a particular object was detected.
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    # Each score represents the level of confidence for each of the objects.
    # Score is shown on the result image, together with the class label.
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # the array based representation of the image will be used later in order to prepare the
      # result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      # Actual detection.
      (boxes, scores, classes, num) = sess.run(
          [detection_boxes, detection_scores, detection_classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8)
      plt.figure(figsize=IMAGE_SIZE)
      plt.imshow(image_np)

I want to store the coordinates of these green boxes in a CSV file. What is a way to do it?

The values are stored in a variable called 'boxes'. Printing that variable shows [[[ 2.44699568e-01 2.14029700e-02 3.81645471e-01 2.81205386e-01] [ 1.58572584e-01 4.91167933e-01 2.97784775e-01 7.96089888e-01] [ 3.64572904e-03 6.43181324e-01 7.87424743e-02 9.45716262e-01] These values do not make sense to me. Is there any way to extract the box data from these numbers? — Ajinkya

ITiger ITiger · Accepted Answer · 2018-01-22T14:59:00

The coordinates in the boxes array ([ymin, xmin, ymax, xmax]) are normalized relative to the image size. Therefore, you have to multiply ymin and ymax by the image's height, and xmin and xmax by its width, to obtain the original pixel values.

To achieve this, you can do something like the following:

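# Image size in pixels; image_np has shape (height, width, 3).
height, width = image_np.shape[:2]
# np.squeeze returns a view, so these assignments scale `boxes` in place.
for box in np.squeeze(boxes):
    box[0] = box[0] * height  # ymin
    box[1] = box[1] * width   # xmin
    box[2] = box[2] * height  # ymax
    box[3] = box[3] * width   # xmax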

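Then you can save the boxes to your CSV file using the numpy.savetxt() method:

import numpy as np
# boxes has shape (1, N, 4); savetxt only accepts 1D or 2D arrays, so drop the batch axis.
np.savetxt('yourfile.csv', np.squeeze(boxes), delimiter=',')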
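
The same scaling can also be written without a loop. The following is a minimal sketch, assuming boxes has the (1, N, 4) shape returned by sess.run and that height and width are the image size in pixels. The helper name boxes_to_pixels and the sample values are made up for illustration and are not part of the Object Detection API.

```python
import numpy as np

def boxes_to_pixels(boxes, height, width):
    """Scale normalized [ymin, xmin, ymax, xmax] boxes to pixel units.

    Hypothetical helper, not part of the Object Detection API.
    Returns a new (N, 4) array; the input array is not modified.
    """
    boxes_2d = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    # Column order is y, x, y, x, so the scale factors alternate.
    scale = np.array([height, width, height, width], dtype=np.float64)
    return boxes_2d * scale

# Made-up example: two normalized boxes for a 480 x 640 (height x width) image.
example = np.array([[[0.25, 0.5, 0.75, 1.0],
                     [0.0, 0.0, 0.5, 0.25]]])
pixel_boxes = boxes_to_pixels(example, height=480, width=640)
print(pixel_boxes)
```

The result is a plain 2D array, so it can be passed directly to np.savetxt, which then writes one box per row.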

Edit:

As pointed out in the comments, the approach above gives a whole list of boxes rather than only the boxes drawn on the image. This is because the boxes tensor holds the coordinates of every detected region, including low-confidence ones. One quick fix is the following, assuming that you use the default confidence threshold of 0.5:

  for i, box in enumerate(np.squeeze(boxes)):
      if np.squeeze(scores)[i] > 0.5:
          print("ymin={}, xmin={}, ymax={}, xmax={}".format(box[0]*height, box[1]*width, box[2]*height, box[3]*width))

This should print the four values only for boxes above the threshold, not for every box. The pair (ymin, xmin) is the top-left corner of the bounding box and (ymax, xmax) is the bottom-right corner.

If you use a different confidence threshold, you have to adjust this value accordingly. Maybe you can parse the model configuration for this parameter.
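
The filtering can also be expressed with a boolean mask, which turns the threshold into an explicit parameter. This is only a sketch, assuming boxes and scores have the shapes (1, N, 4) and (1, N). The helper filter_by_score and the sample numbers are hypothetical.

```python
import numpy as np

def filter_by_score(boxes, scores, min_score=0.5):
    """Return the boxes and scores whose score exceeds min_score.

    Hypothetical helper, not part of the Object Detection API.
    """
    boxes_2d = np.asarray(boxes).reshape(-1, 4)
    scores_1d = np.asarray(scores).reshape(-1)
    keep = scores_1d > min_score  # boolean mask, one entry per box
    return boxes_2d[keep], scores_1d[keep]

# Made-up detections: three boxes, only two of which pass the 0.5 threshold.
sample_boxes = np.array([[[0.1, 0.2, 0.3, 0.4],
                          [0.5, 0.5, 0.9, 0.9],
                          [0.0, 0.0, 0.1, 0.1]]])
sample_scores = np.array([[0.9, 0.6, 0.2]])
kept_boxes, kept_scores = filter_by_score(sample_boxes, sample_scores, min_score=0.5)
print(kept_boxes)
```

Passing a different min_score changes the cut-off in one place instead of in every loop.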

To store the coordinates as CSV, you can do something like:

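new_boxes = []
for i, box in enumerate(np.squeeze(boxes)):
    # Keep only detections above the confidence threshold.
    if np.squeeze(scores)[i] > 0.5:
        new_boxes.append(box)
# Note: the rows are still normalized unless boxes was scaled to pixels beforehand.
np.savetxt('yourfile.csv', new_boxes, delimiter=',')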
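
If a header row and extra columns such as the class id and score are wanted, Python's csv module is an alternative to numpy.savetxt(). The sketch below assumes the (1, N, 4), (1, N) and (1, N) shapes of boxes, scores and classes from sess.run. The helper write_detections_csv, the column names and the sample data are all hypothetical choices.

```python
import csv
import os
import tempfile

import numpy as np

def write_detections_csv(path, image_name, boxes, scores, classes,
                         height, width, min_score=0.5):
    """Write one row per detection above min_score, in pixel coordinates.

    Hypothetical helper; returns the number of data rows written.
    """
    boxes_2d = np.asarray(boxes).reshape(-1, 4)
    scores_1d = np.asarray(scores).reshape(-1)
    classes_1d = np.asarray(classes).reshape(-1).astype(int)
    written = 0
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['image', 'class_id', 'score',
                         'ymin', 'xmin', 'ymax', 'xmax'])
        for box, score, class_id in zip(boxes_2d, scores_1d, classes_1d):
            if score > min_score:
                ymin, xmin, ymax, xmax = box
                writer.writerow([image_name, int(class_id), float(score),
                                 float(ymin * height), float(xmin * width),
                                 float(ymax * height), float(xmax * width)])
                written += 1
    return written

# Made-up detections for a 480 x 640 image; only the first passes the threshold.
csv_path = os.path.join(tempfile.mkdtemp(), 'detections.csv')
n_written = write_detections_csv(
    csv_path, 'example.jpg',
    boxes=np.array([[[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.5, 0.25]]]),
    scores=np.array([[0.9, 0.3]]),
    classes=np.array([[1.0, 2.0]]),
    height=480, width=640)
print(n_written)
```

Each call overwrites the file at path, so inside the loop over TEST_IMAGE_PATHS you would either use a distinct path per image or adapt the helper to append rows.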
