0
votes

I am extracting image regions (as matrices) with OpenCV from a screenshot of a desktop application and using Tesseract to read the cropped images. For instance, for the images below, Tesseract reads the "Relationship" image as R’e‘auunshwp and the "member" image as Mamba!

Is the quality of the attached images too low for Tesseract? What can I do to improve it?

[Two images: the cropped "Relationship" and "member" screenshots]

UPDATE

I used the following code to resize the image, which improved OCR quality. But how do I calculate what size corresponds to 300 DPI, and how do I ensure that the aspect ratio of the image stays the same when scaling?

    Mat resizedMat = new Mat();
    Size sz = new Size(mat.rows()*10,mat.cols()*10);
    Imgproc.resize(mat,resizedMat,sz);

3 Answers

3
votes

The resolution is too low. Try rescaling the image to 300 DPI.
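One way to turn "rescale to 300 DPI" into numbers is to divide the target DPI by the DPI the image was captured at, then multiply width and height by that same factor so the aspect ratio is unchanged. The sketch below is plain Java with no OpenCV dependency; the class name `DpiScale`, the 96 DPI capture resolution, and the 120x24 crop are illustrative assumptions, not values from the question. With OpenCV, the result would be passed as `new Size(width, height)` (width first) to `Imgproc.resize`.

```java
public class DpiScale {
    /** Scale factor that maps an image captured at sourceDpi to targetDpi. */
    public static double scaleFactor(double sourceDpi, double targetDpi) {
        if (sourceDpi <= 0 || targetDpi <= 0) {
            throw new IllegalArgumentException("DPI values must be positive");
        }
        return targetDpi / sourceDpi;
    }

    /** Returns {width, height} after multiplying both sides by the same factor. */
    public static int[] scaledSize(int width, int height, double factor) {
        return new int[] {
            (int) Math.round(width * factor),
            (int) Math.round(height * factor)
        };
    }

    public static void main(String[] args) {
        // Assumption: the screenshot was captured at 96 DPI (a common desktop default).
        double factor = scaleFactor(96.0, 300.0);
        // Hypothetical 120x24 pixel crop of a single word.
        int[] size = scaledSize(120, 24, factor);
        System.out.println(factor + " -> " + size[0] + "x" + size[1]);
        // prints "3.125 -> 375x75"
    }
}
```

Because both sides use one factor, the width-to-height ratio of the output matches the input up to rounding.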

1
votes

As nguyenq said, you should rescale your image, because Tesseract struggles to read low-resolution images.

I answered a similar question here for another person; you could try the same approach. Increase your image size by 200-400%. If that alone does not help, apply some blurring and then a threshold.
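The upscale, blur, threshold sequence could be sketched as follows. To keep it runnable without native libraries, this uses the JDK's `java.awt.image` classes rather than OpenCV; with OpenCV the corresponding steps would be `Imgproc.resize`, `Imgproc.GaussianBlur`, and `Imgproc.threshold`. The class name `OcrPreprocess`, the 3x3 box blur, the 3.0 scale factor, and the cutoff of 128 are all assumptions for illustration and would need tuning on real crops.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;
import java.util.Arrays;

public class OcrPreprocess {
    /** Upscales by the given factor with bicubic interpolation, keeping the aspect ratio. */
    public static BufferedImage upscale(BufferedImage src, double factor) {
        int w = (int) Math.round(src.getWidth() * factor);
        int h = (int) Math.round(src.getHeight() * factor);
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }

    /** Light 3x3 box blur to smooth interpolation artifacts. */
    public static BufferedImage blur(BufferedImage src) {
        float[] weights = new float[9];
        Arrays.fill(weights, 1f / 9f);
        ConvolveOp op = new ConvolveOp(new Kernel(3, 3, weights), ConvolveOp.EDGE_NO_OP, null);
        return op.filter(src, null);
    }

    /** Global threshold: pixels darker than cutoff (0-255) become black, the rest white. */
    public static BufferedImage threshold(BufferedImage src, int cutoff) {
        BufferedImage dst = new BufferedImage(src.getWidth(), src.getHeight(),
                                              BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                dst.setRGB(x, y, gray < cutoff ? 0xFF000000 : 0xFFFFFFFF);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for a cropped screenshot: left half black, right half white.
        BufferedImage crop = new BufferedImage(8, 4, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 4; y++) {
            for (int x = 4; x < 8; x++) {
                crop.setRGB(x, y, 0xFFFFFFFF);
            }
        }
        BufferedImage ready = threshold(blur(upscale(crop, 3.0)), 128);
        System.out.println(ready.getWidth() + "x" + ready.getHeight());
    }
}
```

A fixed cutoff is the simplest choice; an adaptive or Otsu-style threshold may work better when the background brightness varies across the screenshot.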

1
votes

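I finally solved it with this code, using OpenCV:

    import org.opencv.core.Mat;
    import org.opencv.core.Size;
    import org.opencv.imgproc.Imgproc;

    Mat resizedMat = new Mat();
    double width = mat.cols();   // cols() is the image width
    double height = mat.rows();  // rows() is the image height
    double aspect = width / height;
    // Size takes (width, height). Both sides are multiplied by the same
    // factor (aspect * 2), so the aspect ratio is preserved.
    Size sz = new Size(width * aspect * 2, height * aspect * 2);
    Imgproc.resize(mat, resizedMat, sz);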