Thinning the edge of a letter to keep only the center of the letter for OCR

Question

I'm trying to improve the recognition rate for complex characters such as Japanese/Chinese characters.

What kind of image processing should be done to turn the letter on the left-hand side into the letter on the right-hand side?

The idea is to keep only the center line of each stroke (I'm not sure what to call it), making the letter crisper so that the recognition rate of OCR (such as Tesseract) improves.

If there's another approach to improving the recognition rate for such complex letters, it would be nice to know that as well.

CommonSurname · Accepted Answer · 2016-11-30T03:45:35

You're looking for skeletonization, which can be done with morphological operators in OpenCV, scikit-image, or MATLAB. Another option is a distance transform followed by a threshold, as seen in the OpenCV watershed example.
