I am working on reading identity card information using the Tesseract library. I tried it on some images from Google and got good results, but with real-time images, that is, images captured with an iPhone camera, the results are poor.

I found some pre-processing steps suggested by Tesseract.

1. Fix DPI (if needed); 300 DPI is the minimum.

How can I set the DPI of an image when capturing it from the iPhone camera in real time?

2. Fix text size (e.g. 12 pt should be okay).

How do I fix the text size for the large image created by the iPhone camera?

3. Try to fix text lines (deskew and dewarp text).

I read that Tesseract dewarps text using the Leptonica library. Is dewarping or deskewing still needed at this pre-processing stage?

4. Try to fix illumination of image (e.g. no dark part of image).

Can I correct the illumination of the image using OpenCV?

5. Binarize and de-noise image.

I get poor binarized images when I apply a threshold or adaptive threshold for the real-time image.

How can I binarize these real-time images?

1 Answer

    1. and 2.: A point is 1/72 of an inch, so text with a point size of 12 is 12 pixels tall at 72 DPI and about 50 pixels tall at 300 DPI. What you should take from 1. and 2. is that the captured image should be scaled so that the lines of text are around 50 pixels tall. How you would do this depends on how you are capturing the image.
    3. It is easier to ask the user to hold the camera straight :-)
    4. and 5.: You could try to apply some filtering. Again, it might be easier to ask the user to ensure proper lighting.
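
The arithmetic behind points 1. and 2. can be sketched in a few lines of Python. This is only an illustration, not part of Tesseract or any library: the function names `point_size_to_pixels` and `scale_factor` are made up for this example, and the measured line height of 20 pixels is an assumed value that you would have to measure in your own captured image.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 of an inch


def point_size_to_pixels(point_size: float, dpi: float) -> float:
    """Return the height in pixels of text of `point_size` at `dpi`."""
    return point_size * dpi / POINTS_PER_INCH


def scale_factor(measured_line_height_px: float,
                 point_size: float = 12,
                 dpi: float = 300) -> float:
    """Return the factor to resize an image by so that its text lines
    reach the height of `point_size` text at `dpi`.

    `measured_line_height_px` is the text line height found in the
    captured image (hypothetical input; measure it yourself).
    """
    if measured_line_height_px <= 0:
        raise ValueError("measured line height must be positive")
    target = point_size_to_pixels(point_size, dpi)
    return target / measured_line_height_px


# 12 pt text at 300 DPI: 12 * 300 / 72 = 50 pixels
target_height = point_size_to_pixels(12, 300)

# Assumed example: text lines in the photo are 20 pixels tall
factor = scale_factor(20)

print(target_height)  # 50.0
print(factor)         # 2.5
```

The resulting factor could then be passed to whatever resizing routine you already use, for example as the `fx` and `fy` arguments of OpenCV's `cv2.resize`. A factor above 1 means the photo should be enlarged; a factor below 1 means it can be shrunk before OCR.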