
I am using OpenCV 2.4 and Tesseract 3.

I am trying to run OCR on a 14-segment display captured from a webcam.

The issue is that when I trained Tesseract, I had to apply enough erosion/dilation to fill the gaps between the segments. However, the image I read from the webcam has to be pre-processed to remove noise. For this I also use erosion and dilation, and in the resulting picture the segments are not linked:

The OCR result is different every time and can be "OVO" as well as "EB". I thought that training Tesseract on something closer to what I am actually reading (non-linked segments) might work better, but Tesseract can't be trained on characters with gaps like this (it reports "Empty page").

Does anyone have any idea how to solve this?

I tried increasing the size of the erosion/dilation, but then other letters aren't recognized (B and D get confused) and the overall results are worse.

Thank you!

EDIT: Basically, what I need is either a way to link the segments together so that the character is easier for Tesseract to read, or a way to train Tesseract with unlinked segments (from what I've seen, that isn't possible).

@user2950911 I just tried this: take the skeleton and then dilate the result. For the letter 'V' it seems to work, but then 'B' and 'D' become 99% similar. I believe working on the skeleton would work well if we could somehow remove the little branches near the corners (for a rectangle, the skeleton looks like >---< instead of -----). - pHDa
Actually, I tried a thinning algorithm that doesn't produce the little branches near the corners, but the issue remains. If I dilate enough to link the segments of the letter 'V', other characters such as 'B' and 'D' stop working. - pHDa

1 Answer


Isn't it possible to skip Tesseract for this? It looks like you already have a way of partitioning your image into separate characters. You could then number the segments of your display, perhaps as shown at http://www.randomdata.nl/wiki/index.php/Adruino_14_segment_LED_board and simply decide which segments are currently lit. You can then match that against the known segment combinations for all characters, using some form of nearest-distance algorithm to find the best match.
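
To make the "decide which segments are lit" step concrete, here is a rough Python sketch. It assumes the character has already been cropped and thresholded so that lit pixels are nonzero, and that each segment can be approximated by a rectangular region. The function name `lit_segments`, the box coordinates and the `min_fill` threshold are all made up for illustration and would have to be measured and tuned for the real display. Diagonal segments in particular would probably need proper masks rather than boxes.

```python
def lit_segments(binary_img, segment_boxes, min_fill=0.5):
    """Return a tuple of 0/1 flags, one per segment region.

    binary_img    : 2D array-like (rows of pixel values), nonzero = lit.
                    A thresholded OpenCV/NumPy image or a list of lists works.
    segment_boxes : list of (x0, y0, x1, y1) regions, one per segment, in
                    segment-number order. Hypothetical coordinates; they
                    depend on the actual display and camera setup.
    min_fill      : assumed fraction of lit pixels needed to call a
                    segment "on".
    """
    flags = []
    for x0, y0, x1, y1 in segment_boxes:
        area = (x1 - x0) * (y1 - y0)
        # Count lit pixels inside the box.
        lit = sum(1 for row in binary_img[y0:y1] for px in row[x0:x1] if px)
        flags.append(1 if area > 0 and lit / area >= min_fill else 0)
    return tuple(flags)
```

Because this only measures how much of each region is lit, small gaps between segments no longer matter, which is the part that was causing trouble with Tesseract.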

Sticking to the scheme linked above, your V could perhaps be encoded as follows:

segment number: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
switched on:    0 1 1 0 0 0 1 0 1 0  0  0  0  0
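
The matching step could then be sketched along these lines, using the Hamming distance (the number of segments that differ) as the "nearest distance". This is only a minimal illustration: `KNOWN_PATTERNS`, `hamming` and `best_match` are hypothetical names, and only the 'V' row is taken from the encoding above. The patterns for the other characters would have to be filled in from the display's segment numbering.

```python
# Hypothetical lookup table: one 14-element on/off pattern per character,
# in segment-number order. Only 'V' comes from the encoding above; the
# remaining characters must be added for the actual display.
KNOWN_PATTERNS = {
    "V": (0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0),
}

def hamming(a, b):
    """Number of segment positions where the two patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def best_match(observed, patterns=KNOWN_PATTERNS):
    """Return (character, distance) for the closest known pattern."""
    observed = tuple(observed)
    if len(observed) != 14:
        raise ValueError("expected 14 segment flags")
    char = min(patterns, key=lambda c: hamming(observed, patterns[c]))
    return char, hamming(observed, patterns[char])
```

The returned distance is useful as a sanity check: a reading that is several segments away from every known character is probably noise and can be rejected instead of being forced onto the nearest letter.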