How to detect an object in an image using HOG Descriptors?

Question

When tracking an object, I want to be able to re-detect it after an occlusion.

With OpenCV 3.4.5 (C++), I tried template matching and optical flow segmentation. Now I would like to implement a more robust algorithm based on a HOG descriptor.

I made a small example to show the problem. Here are my two images:

the vehicle I want to detect

the image in which I'm searching

PS: I don't want to train an SVM, since I only need to detect one specific object over a few frames.

My code:

#include <opencv2/core/utility.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/objdetect.hpp>

#include <iostream>
#include <vector>

using namespace std;
using namespace cv;

int main(int argc, char** argv){

        //load images
        Mat lastSeenObject=imread("lastSeenObject.png",1); //21x39
        Mat patch=imread("patch.png",1); //150x150

        //params
        Size cellSize(8,8);
        int nbins= 9;
        Size blockSize(2,2);

        //my variables
        vector<float>templ_descriptor;
        vector<float>p_descriptor;
        Mat templ_gray,p_gray,iMatches;
        vector<DMatch> matches;

        //convert to gray
        cvtColor(lastSeenObject,templ_gray,CV_BGR2GRAY);
        cvtColor(patch,p_gray,CV_BGR2GRAY);

        //create hog object
        HOGDescriptor hog(Size(templ_gray.cols/cellSize.width*cellSize.width,templ_gray.rows/cellSize.height*cellSize.height),
                Size(blockSize.height*cellSize.height,blockSize.width*cellSize.width),
                Size(cellSize.height,cellSize.width),
                cellSize,
                nbins);
        // gives --> winSize [32 x 16],  blockSize [16 x 16],  blockStride [8 x 8],  cellSize [8 x 8]

        //compute the descriptor of the car
        hog.compute(templ_gray,templ_descriptor, Size(cellSize.height,cellSize.width), Size( 0, 0 ));
        //templ_descriptor.size() = 108, containing floats between 0 and 1

        //compute the descriptor of the patch
        hog.compute(p_gray,p_descriptor, Size(cellSize.height,cellSize.width), Size( 0, 0 ));
        //p_descriptor.size() = 27540, containing floats between 0 and 1

        //compare the descriptors
        double err=0;
        double min_err = -1;
        int idx=-1;
        for (unsigned int i =0;i<p_descriptor.size();i++)
        {
            if(i%templ_descriptor.size()==0 && i!=0) // iterate the computation of error over the templ_descriptor size
            {
                if(err<min_err || min_err ==-1)
                {
                    min_err = err;
                    idx = i-nbins;
                }
                err = 0;
            }
            //accumulate the absolute difference (L1 distance, not Euclidean) between corresponding histogram components
            err += abs(p_descriptor[i] - templ_descriptor[i%templ_descriptor.size()]);
        }

        // we get idx = 11655 and err = 5.34021

        //convert vector idx to x,y coordinates in the patch
        int row= static_cast<int>(idx/patch.cols);
        int col = idx%patch.cols;

        //show the result
        Rect2f found_object(col,row,hog.winSize.width,hog.winSize.height); // [32 x 16 from (105, 77)]
        rectangle(patch,found_object,Scalar(0,0,255));
        imshow("result",patch);
        waitKey(500000);

        return 0;
}

My result

Of course the expected result is to have the bounding box on the vehicle.

My questions

1/ How is the descriptor returned by the compute function structured?

I assume there are 9 (nbins) floats describing each cell, but I don't understand why I get 108/9 = 12 cells in templ_descriptor when the winSize is 16x32 and the cellSize is 8x8.

2/ How can I retrieve the pixel coordinates of the window in p_descriptor that best matches templ_descriptor?

3/ Do you have any other suggestions for re-detecting my target after small occlusions?

Helpful links

OpenCV 3.4.5 documentation on HOG Descriptor

LearnOpenCV article on HOG

You should look into SIFT instead of HOG. That way you are independent of the scale and rotation of your object, which HOG does not account for. — T A
Yes, but the SIFT algorithm is under a non-free licence in OpenCV... Using the SIFT algorithm, should I extract some points from the target (from goodFeaturesToTrack, for example) and then match them against the descriptor of every pixel in the patch? — emlot77
Simple HOG tracking is not robust. A very good HOG-based tracking algorithm is STAPLE: robots.ox.ac.uk/~luca/staple.html You can test it: github.com/xuduo35/STAPLE If you want the best trade-off between speed and robustness, use KCF. If you want more robustness, use CSRT or STAPLE. — Nuzhny
I am already using CSRT from OpenCV. I just want to implement a plugin to deal with occlusions when CSRT fails! — emlot77

Everley Tseng · Accepted Answer · 2019-03-28T13:45:15

Try using SIFT. To use SIFT in OpenCV 3, you have to build OpenCV with the contrib modules enabled.

If you still want to see whether HOG works, try computing the distance between two vectors: the descriptor of the vehicle image and each window descriptor of the larger image. The vector sizes should be the same if you set blockSize = (vehicle image size).
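The comparison described above can be sketched without OpenCV, to show how the flat descriptor vector is laid out and how a window index maps back to pixel coordinates. This is only a minimal sketch under stated assumptions: the names Size2, HogLayout, hogLayout, Match and bestWindow are hypothetical helpers, not OpenCV API, and the sketch assumes that a dense compute call with zero padding stores one fixed-length descriptor per window, window after window, in row-major order (left to right, then top to bottom).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Plain size type so the sketch does not depend on OpenCV headers (hypothetical).
struct Size2 { int width; int height; };

struct HogLayout {
    int blocksPerWindow;   // blocks along x times blocks along y
    int descriptorLength;  // number of floats describing one window
    int windowsX;          // window positions along x in the searched image
    int windowsY;          // window positions along y in the searched image
};

// Counts blocks and windows for a dense HOG with no padding.
HogLayout hogLayout(Size2 image, Size2 win, Size2 block, Size2 blockStride,
                    Size2 cell, Size2 winStride, int nbins) {
    int blocksX = (win.width - block.width) / blockStride.width + 1;
    int blocksY = (win.height - block.height) / blockStride.height + 1;
    int cellsPerBlock = (block.width / cell.width) * (block.height / cell.height);
    HogLayout l;
    l.blocksPerWindow = blocksX * blocksY;
    l.descriptorLength = l.blocksPerWindow * cellsPerBlock * nbins;
    l.windowsX = (image.width - win.width) / winStride.width + 1;
    l.windowsY = (image.height - win.height) / winStride.height + 1;
    return l;
}

struct Match { int x; int y; double distance; };

// Finds the window whose descriptor has the smallest Euclidean distance to the
// template descriptor. ASSUMPTION: windows are concatenated in row-major order.
Match bestWindow(const std::vector<float>& templ, const std::vector<float>& patch,
                 const HogLayout& layout, Size2 winStride) {
    const std::size_t len = templ.size();
    const std::size_t nWindows =
        static_cast<std::size_t>(layout.windowsX) * static_cast<std::size_t>(layout.windowsY);
    assert(len == static_cast<std::size_t>(layout.descriptorLength));
    assert(patch.size() == len * nWindows);

    Match best{-1, -1, std::numeric_limits<double>::max()};
    for (std::size_t w = 0; w < nWindows; ++w) {
        double sum = 0.0;
        for (std::size_t k = 0; k < len; ++k) {
            double d = static_cast<double>(patch[w * len + k]) - templ[k];
            sum += d * d;
        }
        double dist = std::sqrt(sum);
        if (dist < best.distance) {
            // Window index -> top-left pixel of that window in the searched image.
            best.x = static_cast<int>(w % layout.windowsX) * winStride.width;
            best.y = static_cast<int>(w / layout.windowsX) * winStride.height;
            best.distance = dist;
        }
    }
    return best;
}
```

With the sizes from the question (150x150 patch, winSize 32x16, blockSize 16x16, blockStride 8x8, cellSize 8x8, winStride 8x8, 9 bins), hogLayout gives 3 blocks per window and 3 x 4 cells x 9 bins = 108 floats, because each 16x16 block holds four cells and adjacent blocks overlap. It also gives 15 x 17 = 255 window positions, and 255 x 108 = 27540, which matches the reported size of p_descriptor. Under the row-major assumption, the position has to be derived from the window index (offset in the vector divided by 108), not from the raw offset divided by patch.cols.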

HOG's disadvantage is its weak tolerance to rotation. SIFT tolerates rotation of the object by any angle. However, SIFT relies on detailed patterns on the object, so it may be riskier when the image resolution is rather small.
