I'm working on OCR for energy meter displays: example 1, example 2, example 3
I tried tesseract-ocr with the letsgodigital trained data, but the results are very poor.
I'm fairly new to the topic and this is what I've done:
import numpy as np
import cv2
import imutils
from skimage import exposure
from pytesseract import image_to_string
import PIL
from google.colab.patches import cv2_imshow  # I'm running this in Colab

def process_image(orig_image_arr):
    gry_disp_arr = cv2.cvtColor(orig_image_arr, cv2.COLOR_BGR2GRAY)
    gry_disp_arr = exposure.rescale_intensity(gry_disp_arr, out_range=(0, 255))

    # Otsu thresholding
    ret, thresh = cv2.threshold(gry_disp_arr, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return thresh

def ocr_image(orig_image_arr):
    otsu_thresh_image = process_image(orig_image_arr)
    cv2_imshow(otsu_thresh_image)
    return image_to_string(otsu_thresh_image, lang="letsgodigital",
                           config="--psm 8 -c tessedit_char_whitelist=.0123456789")

img1 = cv2.imread('test2.jpg')
cnv = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
text = ocr_image(cnv)
This gives very poor results with the example images. I have a couple of questions:
How can I identify the four corners of the display? (Edge detection doesn’t seem to work very well)
Is there any further preprocessing that I can do to improve the performance?
Thanks for any help.