16 votes

What is the simplest way to build an object detector in C++ with Fast/Faster R-CNN and Caffe?

As is known, we can use the following R-CNN (Region-based Convolutional Neural Networks) variants with Caffe:

Fast R-CNN: scores, boxes = im_detect(net, im, obj_proposals), which calls def im_detect(net, im, boxes):

This uses rbgirshick/caffe-fast-rcnn, ROIPooling layers, and the bbox_pred output.

Faster R-CNN: scores, boxes = im_detect(net, im), which calls def im_detect(net, im, boxes=None):

This also uses rbgirshick/caffe-fast-rcnn, ROIPooling layers, and the bbox_pred output.

All of these use Python and Caffe, but how can this be done in C++ with Caffe?

There is a C++ example only for classification (saying what is in the image), but not for detection (saying what is in the image and where): https://github.com/BVLC/caffe/tree/master/examples/cpp_classification

Is it enough to simply clone the rbgirshick/py-faster-rcnn repository together with rbgirshick/caffe-fast-rcnn, download the pre-trained model via ./data/scripts/fetch_faster_rcnn_models.sh, use the coco/VGG16/faster_rcnn_end2end/test.prototxt, and make a small change to the CaffeNet C++ Classification example?
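By a small change I mean something like the following minimal sketch (the paths are assumptions based on what fetch_faster_rcnn_models.sh downloads; also note that, if I'm not mistaken, the stock test.prototxt contains a Python 'proposal' layer, so Caffe would likely need WITH_PYTHON_LAYER enabled or a C++ reimplementation of that layer):

    #include <caffe/caffe.hpp>
    #include <string>

    int main() {
      // Hypothetical paths, assumed from fetch_faster_rcnn_models.sh.
      const std::string proto = "models/coco/VGG16/faster_rcnn_end2end/test.prototxt";
      const std::string weights = "data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel";

      caffe::Caffe::set_mode(caffe::Caffe::GPU);

      // Load the net in TEST phase and copy the pre-trained weights,
      // the same way the cpp_classification example does.
      caffe::Net<float> net(proto, caffe::TEST);
      net.CopyTrainedLayersFrom(weights);
      return 0;
    }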

And how can I get the output data from the two layers bbox_pred and cls_score?

Will I get all of it (bbox_pred & cls_score) in one array:

const vector<Blob<float>*>& output_blobs = net_->ForwardPrefilled();
Blob<float>* output_layer = output_blobs[0];
const float* begin = output_layer->cpu_data();
const float* end = begin + output_layer->channels();
std::vector<float> bbox_and_score_array(begin, end);

Or in two arrays?

const vector<Blob<float>*>& output_blobs = net_->ForwardPrefilled();

Blob<float>* bbox_output_layer = output_blobs[0];
const float* begin_b = bbox_output_layer->cpu_data();
const float* end_b = begin_b + bbox_output_layer->channels();
std::vector<float> bbox_array(begin_b, end_b);

Blob<float>* score_output_layer = output_blobs[1];
const float* begin_c = score_output_layer->cpu_data();
const float* end_c = begin_c + score_output_layer->channels();
std::vector<float> score_array(begin_c, end_c);
I have the same question. It would be interesting if you have found the answer or any additional insights. As a final goal, I'd like to have Faster R-CNN as a service with a RESTful API running on a CPU. Thanks. - David Khosid
@David Khosid Not yet. My goal is maximum precision, and I want to use Faster R-CNN with MSRA's ResNet, which won ImageNet. Using a CPU for DNNs is not a good idea, but for maximum speed you can look at the Darknet Yolo Tiny model based on darknet.conv.weights, or maybe SSD300: github.com/weiliu89/caffe/tree/ssd - Alex
Hey @Alex, did you find the answers? I'm also interested in this topic :) Thanks! - lilouch
@lilouch The simplest is Darknet-Yolo for Linux: pjreddie.com/darknet/yolo (you can also find the latest yolo-windows fork on GitHub). If we choose among Darknet-Yolo, Caffe-SSD, and Caffe-FasterRCNN-ResNet: 1. The Yolo DNN returns detected objects directly, without much additional code. 2. Yolo has 3 types of neural net: default (4 GB), small (2 GB), and tiny (1 GB of GPU RAM required), so you can run it on any nVidia GPU. 3. The DNN framework Darknet uses only C/C++/CUDA C, including for its examples, as opposed to the Caffe forks SSD and FasterRCNN, which use C/CUDA C/Python/Matlab; that is good only for R&D. - Alex
Thank you @Alex, that's great! - lilouch

1 Answer

1 vote

For those of you who are still looking for it, there is a C++ version of Faster R-CNN with Caffe in this project. You can even find a C++ API to include it in your own project. I have tested it successfully.