Text detection on Seven Segment Display via Tesseract OCR

Question

The problem that I am running with is to extract the text out of an image and for this I have used Tesseract v3.02. The sample images from which I have to extract text are related to meter readings. Some of them are with solid sheet background and some of them have LED display. I have trained the dataset for solid sheet background and the results are some how effective.

The major problem I have now is the text images with LED/LCD background which are not recognized by Tesseract and due to this the training set isn't generated.

Can anyone guide me to the right direction on how to use Tesseract with the Seven Segment Display(LCD/LED background) or is there any other alternative that I can use instead of Tesseract.

LED background image 1 LED background image 2 Meter 1 with solid sheet background enter image description here

"I have trained the dataset for solid sheet background " .Would you please mind telling, how you achieved this ? — dev
@yunas have you made any progress on this? I am running into the same problem. — daniyalzade

Raymond Ma Raymond Ma · Accepted Answer · 2016-08-14T03:49:21

https://github.com/upupnaway/digital-display-character-rec/blob/master/digital_display_ocr.py

Did this using openCV and tesseract and the "letsgodigital" trained data

-steps include edge detection and extracting the display using the largest contour. Then threshold image using otsu or binarization and pass it through pytesseracts image_to_string function.

Text detection on Seven Segment Display via Tesseract OCR

3 Answers