I'm working on a project in which I want to recognize text from a credit-card-sized document. The document contains details like name, phone number, address, etc. I'm capturing the image and passing it into the Tesseract engine using
text = pytesseract.image_to_string(Image.open(filename), lang='eng'). Sometimes I get decent results for each field, but most of the time the result is very bad. How do I resolve this issue? What are the best practices? How do document readers work with OCR? Is it possible to do region-based OCR on the document?
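For reference, region-based OCR with pytesseract usually just means cropping each field's bounding box and running the engine on the crop, which also tends to help accuracy because Tesseract only sees one short line of text at a time. A minimal sketch (the field names and pixel coordinates below are made-up placeholders, not taken from the question):

    import pytesseract
    from PIL import Image

    # Hypothetical field regions as (left, upper, right, lower) in pixels;
    # the real coordinates would come from the card's actual layout.
    FIELD_BOXES = {
        "name": (40, 30, 500, 80),
        "phone": (40, 90, 500, 140),
        "address": (40, 150, 500, 260),
    }

    def ocr_fields(filename):
        image = Image.open(filename)
        results = {}
        for field, box in FIELD_BOXES.items():
            region = image.crop(box)  # cut out just this field
            results[field] = pytesseract.image_to_string(region, lang="eng").strip()
        return results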
1 vote
Preprocessing the image is extremely important. Generally you want the desired text in black and the background in white. Take a look at here1, here2, here3, here4
- nathancy
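A minimal preprocessing sketch along the lines of that comment, assuming OpenCV is installed (the Otsu threshold and the lack of an inversion step are assumptions that depend on how the card is captured):

    import cv2
    import pytesseract

    def preprocess_and_ocr(filename):
        image = cv2.imread(filename)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Otsu's method picks a global threshold automatically; if the result
        # comes out as white text on black, use cv2.THRESH_BINARY_INV instead,
        # since Tesseract works best with dark text on a white background.
        _, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # pytesseract accepts numpy arrays directly.
        return pytesseract.image_to_string(thresh, lang="eng")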
1 Answer
0 votes
A single approach can't read every kind of text; you have to apply different approaches for different types of documents.
If the text is not horizontal, you have to rotate it first. If the text is skewed or curved, you have to apply a geometric transformation (for example, estimate the skew with a Hough transform and warp the image back).
In general, for the package to read text reliably, the text should be clear and horizontal; otherwise you need rules to detect the distortion and correct it before running OCR. A deskewing sketch is shown below.
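One common way to handle the rotation step is to estimate the skew angle from the text pixels and rotate the image back before calling Tesseract. The sketch below uses OpenCV's minimum-area rectangle rather than a Hough transform; it assumes the image is mostly text and will not fix genuinely curved lines:

    import cv2
    import numpy as np

    def deskew(image):
        """Rotate an image so its text lines are roughly horizontal."""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Invert so text pixels are non-zero, then collect their coordinates.
        thresh = cv2.threshold(gray, 0, 255,
                               cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
        coords = np.column_stack(np.where(thresh > 0)).astype(np.float32)
        angle = cv2.minAreaRect(coords)[-1]
        # NOTE: the angle convention of minAreaRect changed between OpenCV
        # versions, so verify the sign and range on your installation.
        if angle < -45:
            angle = -(90 + angle)
        elif angle > 45:
            angle = 90 - angle
        else:
            angle = -angle
        h, w = image.shape[:2]
        M = cv2.getRotationMatrix2D((w // 2, h // 2), angle, 1.0)
        return cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                              borderMode=cv2.BORDER_REPLICATE)

After deskewing, you can threshold and OCR the result as in the preprocessing snippet above.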