Locate words in binary image

Question

I'd like to figure out a method for finding the bounding boxes of words, or pairs of words, in a binary image. The image looks like this (the bounding boxes I need are marked by blue rectangles):

[image: sample binary image with the required bounding boxes marked in blue]

The image is free of any other objects. I'm thinking about some form of connected component analysis: detect single letters first, then "draw" their bounding boxes on another Mat object in such a way that neighbouring letters connect. There is one useful piece of information I'd like to exploit: a word or a pair of words forms a horizontal line, which could be used to separate "Hello there" from "abcdf". I just don't know how to do it.

I'm interested in bounding boxes, exact text location, not text recognition. — user4205580
Ok. I would start with a single contour and a simple heuristic whether to add another contour to the bounding rect or not. Maybe I'll find the time to test an implementation tomorrow. — Micka

Martin Beckett · Accepted Answer · 2014-11-01T15:07:48

1. Contour the image.
2. Pick contours whose area and width/height are plausible for letters, and get the coordinates of their centers.
3. From the list of centers, decide how far apart two centers can be and still count as adjacent letters rather than a gap.
4. Group these contours into a word and take their bounding box.

OpenCV has clustering, contour-area and bounding-box functions if you don't want to implement these yourself.
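
The grouping step could be sketched roughly as follows. This is only an illustration in plain Python, not the answerer's code: it assumes the letter boxes have already been extracted as `(x, y, w, h)` tuples (the layout that `cv2.boundingRect` returns), and the function name `group_boxes` and the thresholds `max_dx` and `max_dy` are hypothetical values you would tune for your font size. Two letters are treated as neighbours when their centers are close horizontally and lie on roughly the same horizontal line.

```python
def centre(box):
    """Center point of an (x, y, w, h) box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def group_boxes(boxes, max_dx, max_dy):
    """Merge letter boxes into word boxes.

    Two boxes are linked when their centers are at most max_dx apart
    horizontally and max_dy apart vertically (both thresholds are
    assumptions to tune). Linked boxes are merged transitively.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    centres = [centre(b) for b in boxes]
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            dx = abs(centres[i][0] - centres[j][0])
            dy = abs(centres[i][1] - centres[j][1])
            if dx <= max_dx and dy <= max_dy:
                parent[find(i)] = find(j)

    # Accumulate the enclosing rectangle of every group as (x0, y0, x1, y1).
    groups = {}
    for i, (x, y, w, h) in enumerate(boxes):
        root = find(i)
        x0, y0, x1, y1 = groups.get(root, (x, y, x + w, y + h))
        groups[root] = (min(x0, x), min(y0, y), max(x1, x + w), max(y1, y + h))

    # Convert back to (x, y, w, h), sorted for a stable order.
    return sorted((x0, y0, x1 - x0, y1 - y0) for x0, y0, x1, y1 in groups.values())


# Three letters on one line and two letters on a line further down.
letters = [(10, 10, 8, 12), (20, 10, 8, 12), (30, 10, 8, 12),
           (12, 60, 8, 12), (22, 60, 8, 12)]
words = group_boxes(letters, max_dx=12, max_dy=6)
```

Setting `max_dx` a little larger than the widest space between two words keeps a pair such as "Hello there" in one box, while `max_dy` keeps text on different lines apart. Because tall and short letters have centers at different heights, `max_dy` needs some slack.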

