I want to recognize digits in images, so I have been experimenting with Tesseract and Python. I looked into how to prepare the images and ran Tesseract on them, and I am disappointed by how poorly the digits are recognized. I preprocessed the images with OpenCV and thought the results looked clean (see the examples below), but Tesseract still makes a lot of errors on them. Am I expecting too much? Looking at these example images, I would expect Tesseract to read the digits without any problems. Is the accuracy simply not there yet, or is my configuration not optimal? Any help or direction would be appreciated.

Things I tried to improve the digit recognition (nothing improved the results significantly):

  • Limit the allowed characters: config = "--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789"
  • Upscale the images
  • Add a white border around the image to give the digits more space, as I have read that this improves recognition
  • Threshold the image so it contains only black and white pixels

Examples:

Image 1:

Tesseract recognized: 72

Image 2:

Tesseract recognized: 0

EDIT: Image 3:

https://ibb.co/1qVtRYL

Tesseract recognized: 1723

Have you tried textract? github.com/deanmalmgren/textract - anon
I think textract doesn't work with images, or am I wrong? I want to be able to recognize these digits from images. - Dynamicnotion

1 Answer

I'm not sure what's going wrong for you. I downloaded those images and tesseract interprets them just fine for me. What version of tesseract are you using (I'm using 5.0)?

781429

209441

import cv2
import pytesseract
from PIL import Image

# Path to the Tesseract executable (adjust for your machine)
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\ichu\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

# Load the images (OpenCV reads them in BGR channel order)
first = cv2.imread("first_text.png")
second = cv2.imread("second_text.png")
images = [first, second]

# Convert to Pillow images, which expect RGB channel order
pimgs = []
for img in images:
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    pimgs.append(Image.fromarray(rgb))

# Run OCR, restricted to digits
config = '--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789'
for img in pimgs:
    text = pytesseract.image_to_string(img, config=config)
    print(text.strip())  # strip the trailing newline and form feed