Fast-RCNN final bounding box

Question

I've been playing around with Fast-RCNN for a while, but I still don't understand some of its core mechanisms.

In the tutorial slides (page 28 of http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf), they have an example output with only one bounding box per object:

http://s22.postimg.org/7rbu05xbl/Screen_Shot_2015_12_04_at_2_19_57_PM.png

As far as I can tell, non-maximum suppression is performed on all region proposals (https://github.com/rbgirshick/fast-rcnn/blob/master/lib/fast_rcnn/test.py#L324), but in my case the output still contains tens of regions for each object in the image.

My bounding boxes look like the following with a score threshold of 0.99:

http://s29.postimg.org/oc33ujgrb/foo.jpg

How and where are the overlapping bounding boxes for an object merged into a single final box?

Can you please post an image that exemplifies what you're trying to explain? — carlosdc
@carlosdc I assumed that the bounding boxes are reduced to a few final ones, but that may be incorrect. Does Fast-RCNN simply return the scores, leaving it to the user to deal with them as they like? — ytrewq
@carlosdc For example, on page 28 of tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf, the bounding boxes seem to be finalized. — ytrewq
@carlosdc But in my case there are tens of highly overlapping boxes around, say, the same car. How does it settle on the single best-fitting bounding box per object? — ytrewq

yossiB · Accepted Answer · 2016-09-13T12:22:30

Non-maximum suppression should definitely filter out the overlapping bounding boxes in your example image. Check again that you are applying it properly, and apply it after refining the initial bounding boxes using the network output (the bounding-box regression).
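
To make the idea concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression for the detections of a single class. It is not the code from the Fast-RCNN repository: the function names `iou` and `nms`, the IoU threshold of 0.3, and the sample boxes and scores are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.3):
    """Greedy NMS: return indices of the kept boxes, highest score first.

    A box is kept only if it does not overlap any already-kept,
    higher-scoring box by more than iou_thresh (an assumed value).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections for one class: three boxes on one car, one on another.
boxes = [
    (100, 100, 200, 200),
    (105, 102, 205, 198),
    (98, 95, 195, 205),
    (400, 120, 480, 190),
]
scores = [0.995, 0.992, 0.991, 0.993]

kept = nms(boxes, scores)
print([boxes[i] for i in kept])
```

In this sketch the three heavily overlapping boxes collapse to the highest-scoring one, while the separate box survives. If many overlapping boxes remain after a step like this, likely causes are that NMS was run before the boxes were refined, that it was run across classes instead of per class, or that the returned indices were never used to filter the detections.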
