1
votes

I'd like to figure out a method for finding the bounding boxes of words, or pairs of words, in a binary image. The image looks like this (the bounding boxes I need are marked by blue rectangles):

[Image: binary image of text, with the desired bounding boxes marked by blue rectangles]

The image is free of any other objects. I'm thinking about some form of connected component analysis: detect single letters first, then "draw" their bounding boxes on another Mat object in such a way that neighbouring letters connect. There is one useful piece of information I'd like to exploit: a word or a pair of words forms a horizontal line, which could be used to separate "Hello there" from "abcdf". I just don't know how to do it.

2
What about an OCR library like Tesseract? - Micka
I'm interested in bounding boxes, exact text location, not text recognition. - user4205580
Ok. I would start with a single contour and a simple heuristic whether to add another contour to the bounding rect or not. Maybe I'll find the time to test an implementation tomorrow. - Micka

2 Answers

2
votes
  1. Find the contours in the image.
  2. Keep the contours whose area and width/height are plausible for a letter, and record the coordinates of their centers.
  3. From the list of centers, decide how far apart two centers can be and still belong to adjacent letters rather than being separated by a gap.
  4. Group these contours into words and take the bounding box of each group.

OpenCV has clustering, contour area and bounding box functions if you don't want to write them yourself.
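The grouping in steps 3 and 4 could be sketched roughly as below. This is only an illustration in plain Python, not a tested OpenCV pipeline: it assumes the letter boxes have already been obtained as (x, y, w, h) tuples (in OpenCV they would typically come from `findContours` and `boundingRect`), and the function name `group_boxes` and the thresholds `max_dx` and `max_dy` are made up for the example and would need tuning to the font size.

```python
def group_boxes(boxes, max_dx, max_dy):
    """Merge letter boxes (x, y, w, h) whose centers are close together.

    Two letters are linked when their centers differ by at most max_dx
    horizontally and max_dy vertically (hypothetical thresholds).
    Returns one bounding box per linked group, sorted.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    centers = [(x + w / 2, y + h / 2) for x, y, w, h in boxes]
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            dx = abs(centers[i][0] - centers[j][0])
            dy = abs(centers[i][1] - centers[j][1])
            if dx <= max_dx and dy <= max_dy:
                parent[find(i)] = find(j)

    # Accumulate the extent (x0, y0, x1, y1) of every group.
    groups = {}
    for i, (x, y, w, h) in enumerate(boxes):
        root = find(i)
        x0, y0, x1, y1 = groups.get(root, (x, y, x + w, y + h))
        groups[root] = (min(x0, x), min(y0, y), max(x1, x + w), max(y1, y + h))
    return sorted((x0, y0, x1 - x0, y1 - y0) for x0, y0, x1, y1 in groups.values())


# Three letters on one line and two on a lower line.
letters = [(0, 0, 10, 12), (12, 0, 10, 12), (24, 0, 10, 12),
           (0, 40, 10, 12), (13, 40, 10, 12)]
print(group_boxes(letters, max_dx=15, max_dy=6))
```

Keeping `max_dy` small is what uses the "words form a horizontal line" hint: letters on different lines never link, however close they are horizontally. Raising `max_dx` above the width of a space would merge a pair of words such as "Hello there" into one box.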

2
votes
  1. Dilate along the X axis with a window of size N, where N is approximately 1 to 2 times the letter width; each word then becomes a filled black "box".
  2. Find contours (see http://docs.opencv.org/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html ).
  3. Find the bounding rectangles and correct their width (subtract approximately one letter width) to compensate for the enlargement caused by the dilation.
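To make the three steps concrete, here is a minimal pure-Python sketch of the same idea on a tiny image stored as a list of rows, with 1 for text pixels and 0 for background. It is an illustration under stated assumptions rather than the OpenCV code itself: a plain flood fill stands in for contour finding, the helper names (`dilate_horizontal`, `component_boxes`, `word_boxes`) and the `radius` parameter are invented for the example, and the width correction assumes the text is at least `radius` pixels away from the left and right image borders.

```python
from collections import deque


def dilate_horizontal(img, radius):
    """Smear every set pixel `radius` pixels to the left and right."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[y][nx] = 1
    return out


def component_boxes(img):
    """Bounding boxes (x, y, w, h) of 4-connected components, in scan order."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not img[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:
                x, y = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes


def word_boxes(img, radius):
    """Dilate horizontally, box the blobs, then undo the widening."""
    dilated = dilate_horizontal(img, radius)
    # Assumes no text within `radius` pixels of the left/right border,
    # so every blob grew by exactly `radius` on each side.
    return [(x + radius, y, w - 2 * radius, h)
            for x, y, w, h in component_boxes(dilated)]


# Two "letters" one pixel apart, a third one far away, and a second line.
img = [[0] * 20 for _ in range(5)]
for x in (2, 3, 5, 6, 12, 13):
    img[1][x] = 1
for x in (2, 3):
    img[3][x] = 1
print(word_boxes(img, radius=1))
```

Because the smear is purely horizontal, lines of text stacked above each other stay separate, while letters closer together than about twice `radius` fuse into one blob. With OpenCV the same effect would normally come from `dilate` with a one-row rectangular kernel followed by `findContours` and `boundingRect`.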