42
votes

I often get confused by the meaning of the term descriptor in the context of image features. Is a descriptor the description of the local neighborhood of a point (e.g. a float vector), or is a descriptor the algorithm that outputs the description? Also, what exactly is the output of a feature extractor?

I have been asking myself this question for a long time, and the only explanation I came up with is that a descriptor is both the algorithm and the description. A feature detector is used to detect distinctive points. In that case, however, the term feature extractor does not seem to make any sense.

So, is a feature descriptor the description or the algorithm that produces the description?

2
Thanks for the quick reply. Unfortunately, this led to even more confusion. (Richard)

2 Answers

88
votes

A feature detector is an algorithm which takes an image and outputs locations (i.e. pixel coordinates) of significant areas in your image. An example of this is a corner detector, which outputs the locations of corners in your image but does not tell you any other information about the features detected.
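To make the "image in, locations out" idea concrete, here is a minimal sketch of a toy corner detector in plain Python. It is a simplified Moravec-style detector that I picked for illustration, not the method of any particular library; the function name `moravec_corners`, the `threshold` value and the tiny test image are all assumptions of this example.

```python
def moravec_corners(image, threshold, window=1):
    """Toy Moravec-style detector: returns only (x, y) locations."""
    h, w = len(image), len(image[0])
    shifts = [(1, 0), (0, 1), (1, 1), (1, -1)]  # (dx, dy) directions to test
    margin = window + 1  # keep shifted windows inside the image
    corners = []
    for y in range(margin, h - margin):
        for x in range(margin, w - margin):
            scores = []
            for dx, dy in shifts:
                # Sum of squared differences between the window and its shifted copy
                ssd = 0
                for v in range(-window, window + 1):
                    for u in range(-window, window + 1):
                        diff = image[y + v + dy][x + u + dx] - image[y + v][x + u]
                        ssd += diff * diff
                scores.append(ssd)
            # A corner changes under every shift; an edge or flat area does not
            if min(scores) > threshold:
                corners.append((x, y))
    return corners


# 8x8 test image: dark background with a bright block whose corner is at (4, 4)
image = [[1 if x >= 4 and y >= 4 else 0 for x in range(8)] for y in range(8)]
corners = moravec_corners(image, threshold=1)
print(corners)  # -> [(4, 4)]
```

Note that the output is just a list of pixel coordinates. Nothing in it says what the corner looks like, which is the job of the descriptor.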

A feature descriptor is an algorithm which takes an image and outputs feature descriptors/feature vectors. Feature descriptors encode interesting information into a series of numbers and act as a sort of numerical "fingerprint" that can be used to differentiate one feature from another. Ideally this information would be invariant under image transformation, so we can find the feature again even if the image is transformed in some way. An example would be SIFT, which encodes information about the image gradients in the local neighbourhood into the numbers of the feature vector. Other examples you can read about are HOG and SURF.
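As a rough illustration of the "fingerprint" idea, the sketch below builds a tiny gradient-orientation histogram around one location. It is loosely inspired by what SIFT and HOG do but is far simpler than either; the names `orientation_histogram` and `distance`, the patch radius and the number of bins are assumptions made for this example.

```python
import math


def orientation_histogram(image, x, y, radius=2, bins=8):
    """Toy descriptor: magnitude-weighted histogram of gradient directions."""
    hist = [0.0] * bins
    for v in range(y - radius, y + radius + 1):
        for u in range(x - radius, x + radius + 1):
            # Central differences approximate the image gradient
            gx = image[v][u + 1] - image[v][u - 1]
            gy = image[v + 1][u] - image[v - 1][u]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude
    # Normalise to unit length so overall contrast does not dominate
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Two 7x7 patches: brightness increasing left-to-right vs. top-to-bottom
ramp_x = [[x for x in range(7)] for _ in range(7)]
ramp_y = [[y] * 7 for y in range(7)]
brighter = [[p + 10 for p in row] for row in ramp_x]

d_x = orientation_histogram(ramp_x, 3, 3)
d_y = orientation_histogram(ramp_y, 3, 3)
d_bright = orientation_histogram(brighter, 3, 3)

print(distance(d_x, d_bright))  # same structure, different brightness
print(distance(d_x, d_y))       # different structure
```

The first distance is zero because adding a constant brightness does not change the gradients, which is a small example of the invariance mentioned above. The second distance is large because the two patches have different gradient directions.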


EDIT: When it comes to feature detectors, the "location" might also include a number describing the size or scale of the feature. This is because things that look like corners when "zoomed in" may not look like corners when "zoomed out", and so specifying scale information is important. So instead of just using an (x,y) pair as a location in "image space", you might have a triple (x,y,scale) as location in "scale space".
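A small sketch of what an (x,y,scale) output can look like: the toy detector below tries several box sizes and reports the location and size that respond most strongly. The center-surround box filter is a stand-in I chose for simplicity, not what SIFT or any specific detector does; `Keypoint`, `box_mean`, `strongest_blob` and the candidate radii are names and values assumed for this example.

```python
from typing import NamedTuple


class Keypoint(NamedTuple):
    x: int
    y: int
    scale: int  # here: half-width in pixels of the best-matching box


def box_mean(image, x, y, r):
    """Mean intensity of the (2r+1) x (2r+1) box centred on (x, y)."""
    values = [image[v][u]
              for v in range(y - r, y + r + 1)
              for u in range(x - r, x + r + 1)]
    return sum(values) / len(values)


def strongest_blob(image, radii):
    """Return the (x, y, scale) triple with the largest center-surround response."""
    h, w = len(image), len(image[0])
    best, best_response = None, float("-inf")
    for r in radii:
        margin = 2 * r  # the surround box has radius 2r
        for y in range(margin, h - margin):
            for x in range(margin, w - margin):
                # Bright center relative to a twice-as-large surround
                response = box_mean(image, x, y, r) - box_mean(image, x, y, 2 * r)
                if response > best_response:
                    best, best_response = Keypoint(x, y, r), response
    return best


# 15x15 image with a bright 5x5 square centred on (7, 7)
image = [[1 if 5 <= x <= 9 and 5 <= y <= 9 else 0 for x in range(15)]
         for y in range(15)]
print(strongest_blob(image, radii=[1, 2, 3]))
```

For this image the strongest response is at the center of the square with `scale=2`, the half-width of the 5x5 square, so the detector reports both where the feature is and how big it is.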

0
votes

As I understand it, a descriptor is the description of the neighborhood of a point in the image. In other words, it is a vector that describes the visual features of the image content around that point.

For example, HOG (Histogram of Oriented Gradients) includes steps called Image Gradients and Spatial/Orientation Binning. The pages on extractHOGFeatures in Matlab and Classification using HOG have visual examples that make this easier to understand.