
I'd like to use the TensorFlow Object Detection API to identify objects in a series of webcam images. The Faster R-CNN models pre-trained on the COCO dataset appear to be suitable, as they contain all the object categories I need.

However, I'd like to improve the performance of the model at identifying fairly small objects within each image. If I understand correctly, I need to edit the anchor scales parameter in the config file to get the model to use smaller bounding boxes.

My questions are:

  • Do I need to re-train the model on the entire COCO dataset after I adjust this parameter? Or is there a way to change the model just for inference and avoid any re-training?
  • Are there any other tips/tricks to successfully identifying small objects, short of cropping the image into sections and running inference on each one separately?

Background info

I'm currently feeding 1280x720 images to the model. Once objects are around 200x150 pixels in size, I'm finding them harder to detect.

1 Answer

  1. Unfortunately, you'll need to retrain completely, since the learned weights depend on the shapes of the anchors.

  2. A feature map with higher resolution should help, at the cost of slower inference. You can get one either by changing the feature extractor to one that reduces the input size less (max-pooling layers with stride > 1 are usually what reduce the spatial size), or by upscaling the image a bit in the initial image resizer.
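
To get a feel for both points, here is a rough Python sketch of the arithmetic involved. It is not code from the Object Detection API. The numbers are assumptions based on what I believe the sample Faster R-CNN COCO configs use (a 256-pixel base anchor, scales 0.25 to 2.0, aspect ratios 0.5 to 2.0, a feature stride of 16, and an aspect-preserving resizer with min_dimension 600 and max_dimension 1024), so check them against your own pipeline config. The helper names (anchor_sizes, resize_factor, cells_covered) are made up for illustration.

```python
import math


def anchor_sizes(scales, aspect_ratios, base_size=256):
    """Return (width, height) in pixels for every scale/aspect-ratio pair.

    Assumes the usual grid-anchor convention: width grows and height
    shrinks with sqrt(aspect_ratio), so the anchor area stays fixed
    for a given scale.
    """
    sizes = []
    for scale in scales:
        for ratio in aspect_ratios:
            width = scale * math.sqrt(ratio) * base_size
            height = scale / math.sqrt(ratio) * base_size
            sizes.append((width, height))
    return sizes


def resize_factor(width, height, min_dimension, max_dimension):
    """Scale factor of an aspect-preserving resize.

    The shorter side is scaled to min_dimension, unless that would
    push the longer side past max_dimension.
    """
    return min(min_dimension / min(width, height),
               max_dimension / max(width, height))


def cells_covered(obj_width, obj_height, factor, stride=16):
    """Approximate feature-map footprint of an object, in cells."""
    return (obj_width * factor / stride, obj_height * factor / stride)


# Assumed default-like settings (hypothetical, verify against your config).
factor = resize_factor(1280, 720, min_dimension=600, max_dimension=1024)
print("resize factor:", factor)
print("200x150 object in feature-map cells:", cells_covered(200, 150, factor))
print("anchors (w, h):", anchor_sizes([0.25, 0.5, 1.0, 2.0], [0.5, 1.0, 2.0]))
```

Under these assumed settings a 1280x720 frame is actually shrunk before it reaches the feature extractor, so raising the resizer limits (or lowering the stride) increases the number of feature-map cells each small object covers, while smaller scales produce smaller anchors to match them.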