
I am trying to extract text from tyre images. Since the background and the foreground text are very similar in colour, most OCR engines (I tried Google OCR and Tesseract) fail to detect the text. Can you suggest some preprocessing steps for this task to improve the OCR accuracy?

Sample image: [tyre image]

I have tried thresholding and edge detection on these images. I am not getting proper output from thresholding, but edge detection gives me some lead.

Here is a result from Holistically-Nested Edge Detection (HED) with OpenCV:

[image: HED edge detection result]

Have you tried an edge filter, maybe? Just as a first thought. – LeoE
Look into color thresholding. The text seems to be a slightly different shade, so it might be possible to differentiate it from the tire. – nathancy
@LeoE I have tried HED edge detection; it works for a few cases, but not all. You can see my results above. – Mukul
@nathancy I have tried thresholding with greyscale images only (not working); I don't know how to do it with RGB images. – Mukul

1 Answer


A quick test is the best way to get a feel for the complexity and to validate the approach. Let's use the following example:

[image: example tyre crop]

Color thresholding is the first option to try, and it works pretty well here, given the pretty much ideal initial conditions:

[image: colour thresholding result]
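A rough sketch of that first pass, assuming OpenCV and an input file named "tyre.jpg" (the HSV bounds below are only guesses and will need tuning per image):

```python
import cv2
import numpy as np

img = cv2.imread("tyre.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Keep low-saturation, mid-to-bright pixels: embossed letters on a tyre
# are usually a slightly lighter grey than the surrounding rubber.
lower = np.array([0, 0, 100])
upper = np.array([179, 80, 255])
mask = cv2.inRange(hsv, lower, upper)

# Clean the mask a little before handing it to an OCR engine.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

cv2.imwrite("mask.png", mask)
```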

A slightly different case will require additional tuning (see the tuning sketch below), so it might be really hard to develop a solution that covers all cases. Different lighting conditions will potentially lead to a completely different set of thresholds, and so on.

[image: result for a slightly different case]
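One way to do that per-image tuning is interactively; here is a sketch using OpenCV trackbars (trackbars are my choice, not something prescribed above, and the initial slider values are arbitrary):

```python
import cv2
import numpy as np

def nothing(_):
    pass

img = cv2.imread("tyre.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

cv2.namedWindow("mask")
for name, maxval, init in [("H_lo", 179, 0), ("S_lo", 255, 0), ("V_lo", 255, 100),
                           ("H_hi", 179, 179), ("S_hi", 255, 80), ("V_hi", 255, 255)]:
    cv2.createTrackbar(name, "mask", init, maxval, nothing)

while True:
    # Read the current slider positions and show the resulting mask live.
    lo = np.array([cv2.getTrackbarPos(n, "mask") for n in ("H_lo", "S_lo", "V_lo")])
    hi = np.array([cv2.getTrackbarPos(n, "mask") for n in ("H_hi", "S_hi", "V_hi")])
    cv2.imshow("mask", cv2.inRange(hsv, lo, hi))
    if cv2.waitKey(30) & 0xFF == 27:  # Esc to quit
        break

cv2.destroyAllWindows()
```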

An edge filter might provide additional insights, but "textured" letters will make the task a bit trickier. Finally, it might be possible to use a neural network (with a proper training set) to catch all the specific details, letters, numbers, etc., but there is no guarantee the final accuracy will be high enough.
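For the edge-filter route, here is a minimal sketch; Canny (with arbitrary 50/150 thresholds) stands in for "edge filter", since no specific one is prescribed above, and pytesseract is assumed to be installed for the OCR step:

```python
import cv2
import pytesseract

img = cv2.imread("tyre.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.GaussianBlur(img, (5, 5), 0)
edges = cv2.Canny(img, 50, 150)

# Thicken the edge strokes slightly so letters read as solid shapes.
edges = cv2.dilate(edges, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))

# Tesseract expects dark text on a light background, so invert before OCR.
text = pytesseract.image_to_string(255 - edges, config="--psm 6")
print(text)
```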