
At the moment I am doing some research with the TensorFlow Object Detection API. For this I followed this tutorial:

https://www.oreilly.com/ideas/object-detection-with-tensorflow

This tutorial describes how to generate TFRecord files from images and PASCAL VOC XML label files, as well as how to get started with the Object Detection API.

To generate those TFRecords I modified some code from the referenced raccoon repository on GitHub:

https://github.com/datitran/raccoon_dataset

I labeled my images with LabelImg (https://github.com/tzutalin/labelImg), which can save annotations in PASCAL VOC format.
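For reference, the box coordinates can be read back out of such a VOC file with just the standard library; a minimal self-contained sketch (the annotation snippet and the `parse_voc_boxes` helper are my own illustration, not part of the API):

```python
import xml.etree.ElementTree as ET

# Minimal PASCAL VOC annotation as LabelImg writes it (trimmed to the parts used here)
voc_xml = """
<annotation>
  <object>
    <name>myclass</name>
    <bndbox>
      <xmin>42</xmin>
      <ymin>90</ymin>
      <xmax>87</xmax>
      <ymax>125</ymax>
    </bndbox>
  </object>
</annotation>
"""

def parse_voc_boxes(xml_string):
    """Return a list of (ymin, xmin, ymax, xmax) pixel tuples from a VOC annotation."""
    root = ET.fromstring(xml_string)
    boxes = []
    for obj in root.iter('object'):
        b = obj.find('bndbox')
        boxes.append((int(b.find('ymin').text), int(b.find('xmin').text),
                      int(b.find('ymax').text), int(b.find('xmax').text)))
    return boxes

print(parse_voc_boxes(voc_xml))  # [(90, 42, 125, 87)]
```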

So I followed the tutorial and did a first training run with 60 images. After an hour (574 steps) I interrupted it to save a checkpoint. After that I exported a frozen model with export_inference_graph.py (correct me if I'm saying something stupid, this stuff is also new to me...).

And after this I modified the Jupyter notebook from the tutorial for my needs and, tada, there is some recognition in my test images.

So far so good, but now I want to see how accurate the object detection is. For this I wanted to add the ground truth boxes from my PASCAL VOC test dataset, but I am having trouble reaching that goal.

The first thing I tried was to manually read the boxes from my VOC dataset and draw them onto the image with matplotlib's Rectangle patch: https://matplotlib.org/devdocs/api/_as_gen/matplotlib.patches.Rectangle.html

But with my solution the detections and the ground truth boxes end up in different plots/figures...
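For reference, the single-figure version would look roughly like this: create one figure/axes pair, `imshow` the image (with the detections already drawn into it), and add the `Rectangle` patch to that same axes. A minimal sketch with a dummy image (all sizes and coordinates are placeholders):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, just for this sketch
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

image_np = np.zeros((200, 150, 3), dtype=np.uint8)  # stand-in for a real test image

fig, ax = plt.subplots(1)  # ONE figure and ONE axes for everything
ax.imshow(image_np)        # the detection output is already painted into image_np

# ground truth box in pixel coordinates, as stored in the VOC file
xmin, ymin, xmax, ymax = 42, 90, 87, 125
ax.add_patch(Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                       linewidth=2, edgecolor='g', facecolor='none'))
fig.savefig('with_ground_truth.png')
```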

So then I thought maybe the Object Detection API provides functions to draw ground truth boxes and to evaluate the accuracy of my detections against my test VOC dataset.

So I took a look at https://github.com/tensorflow/models/tree/master/research/object_detection/utils and I thought I had found a function (draw_bounding_box_on_image_array) to draw boxes onto my image_np, but nothing happens. This is what the API uses for visualization:

vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      np.squeeze(boxes),
      np.squeeze(classes).astype(np.int32),
      np.squeeze(scores),
      category_index,
      use_normalized_coordinates=True,
      line_thickness=2)

and this is what I tried to use:

vis_util.draw_bounding_box_on_image(
        image_np,
        bndbox_coordinates[0][1],
        bndbox_coordinates[0][0],
        bndbox_coordinates[0][3],
        bndbox_coordinates[0][2])

but there aren't any boxes when I plot this numpy array image.
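For reference, two things seem relevant here (based on my reading of visualization_utils.py, so treat this as an assumption): draw_bounding_box_on_image expects a PIL Image rather than a numpy array, and the drawing functions default to use_normalized_coordinates=True, so pixel values like 90 or 125 are interpreted as fractions of the image and clipped. A self-contained sketch of what the pixel-coordinate variant effectively does (the `draw_box_on_array` helper is my own, not from the API):

```python
import numpy as np
from PIL import Image, ImageDraw

def draw_box_on_array(image_np, ymin, xmin, ymax, xmax,
                      color=(255, 0, 0), width=2):
    """Rough stand-in for vis_util.draw_bounding_box_on_image_array(...,
    use_normalized_coordinates=False): draw in place on a uint8 HxWx3 array."""
    pil_image = Image.fromarray(image_np)
    draw = ImageDraw.Draw(pil_image)
    draw.rectangle([(xmin, ymin), (xmax, ymax)], outline=color, width=width)
    np.copyto(image_np, np.array(pil_image))  # write the result back in place

image_np = np.zeros((200, 150, 3), dtype=np.uint8)  # dummy image
draw_box_on_array(image_np, 90, 42, 125, 87)
```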

Am I missing something? And question 2: is there a class in the API that evaluates the detection accuracy, and if so, does it use the PASCAL VOC metric? I can't find one... Maybe I can use this: https://github.com/tensorflow/models/blob/master/research/object_detection/utils/object_detection_evaluation.py but I'm not confident, because I am also new to Python and some of the code/comments are hard for me to understand...

Maybe you professionals out there can help me.

EDIT:

I have read a bit of this article: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/

and now I know that I need IoU (Intersection over Union). Does anyone know if the Object Detection API provides a function for this? I will look into the API again...
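For reference, IoU itself is only a few lines if the boxes are plain pixel tuples; a minimal sketch with boxes as (ymin, xmin, ymax, xmax), matching the coordinate order the API's visualization functions use (the `iou` helper is my own illustration):

```python
def iou(box_a, box_b):
    """Intersection over Union for two (ymin, xmin, ymax, xmax) pixel boxes."""
    # coordinates of the intersection rectangle
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0, ymax - ymin) * max(0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

print(iou((90, 42, 125, 87), (90, 42, 125, 87)))  # identical boxes -> 1.0
```

As far as I can tell the utils folder also ships a vectorized version (np_box_ops.py), but I have not verified its exact signature, so the helper above is a safe fallback.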


2 Answers


I think you are not passing the right parameters:

vis_util.visualize_boxes_and_labels_on_image_array(
  image_np,
  np.squeeze(boxes),
  np.squeeze(classes).astype(np.int32),
  np.squeeze(scores),
  category_index,
  use_normalized_coordinates=True,
  line_thickness=2)

You need to pass: image_np (the image array), np.squeeze(boxes) (the bounding box coordinates), np.squeeze(classes).astype(np.int32) (the class each box belongs to), and np.squeeze(scores) (the confidence score, which for ground truth boxes is always 1).
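For example, the arrays for the ground truth boxes could be built like this (a sketch under the assumption that use_normalized_coordinates=True is kept, so the pixel coordinates from the VOC file have to be divided by the image size first; the `voc_to_normalized` helper and all values are placeholders):

```python
import numpy as np

def voc_to_normalized(boxes_px, im_height, im_width):
    """Convert pixel (ymin, xmin, ymax, xmax) boxes to the normalized
    [ymin, xmin, ymax, xmax] in [0, 1] that the visualization call expects."""
    boxes = np.asarray(boxes_px, dtype=np.float32)
    scale = np.array([im_height, im_width, im_height, im_width], dtype=np.float32)
    return boxes / scale

gt_boxes = voc_to_normalized([(90, 42, 125, 87)], im_height=200, im_width=150)
gt_classes = np.array([1], dtype=np.int32)  # the label id from your label map
gt_scores = np.ones(len(gt_boxes))          # ground truth confidence is always 1.0
```

These arrays can then be passed in place of np.squeeze(boxes), np.squeeze(classes).astype(np.int32) and np.squeeze(scores) in the call above.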


If you just want some plain Python code, you can leverage the helpful functions inside TensorFlow's object detection utils folder: https://github.com/tensorflow/models/blob/master/research/object_detection/utils/visualization_utils.py

You can then overlay the ground truth boxes and the predicted boxes on the original images by using the [y_min, x_min, y_max, x_max] coordinates of both. Look at object_detection_tutorial.ipynb to see the load_image_into_numpy_array function. For example, to display a ground truth box with coordinates [90, 42, 125, 87], you can do something like this:

    from PIL import Image
    from matplotlib import pyplot as plt
    import numpy as np
    from utils import visualization_utils as vis_util

    IMAGE_SIZE = (12, 8)                    # display size in inches, as in the tutorial notebook
    image_path = 'test_images/image1.jpg'   # placeholder: path to one of your test images

    def load_image_into_numpy_array(image):  # adopted from object_detection_tutorial.ipynb
        (im_width, im_height) = image.size
        return np.array(image.getdata()).reshape(
            (im_height, im_width, 3)).astype(np.uint8)

    image = Image.open(image_path)
    image_np = load_image_into_numpy_array(image)  # numpy array with shape [height, width, 3]
    # only needed when feeding the image to the detector, not for drawing:
    image_np_expanded = np.expand_dims(image_np, axis=0)

    # ground truth box taken from the PASCAL VOC annotation:
    #     <xmin>42</xmin>
    #     <ymin>90</ymin>
    #     <xmax>87</xmax>
    #     <ymax>125</ymax>
    vis_util.draw_bounding_box_on_image_array(
        image_np, 90, 42, 125, 87,
        color='red', thickness=2, use_normalized_coordinates=False)

    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)
    plt.show()