4
votes

I am getting the error below from Tesseract for an image larger than 5 MB.

    Tesseract Open Source OCR Engine v3.01 with Leptonica
    Page 0
    Image too large: (39667, 56133)
    Error during processing.

Is there a limit on file size, or is there a parameter that resolves this issue?

I appreciate your help.

2 Answers

13
votes

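The maximum width and height are 32767 pixels each (the value of MAX_INT16), so a 39667 x 56133 image exceeds the limit in both dimensions.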

From the source code (file baseapi.cpp):

    if (tesseract_->ImageWidth() > MAX_INT16 ||
        tesseract_->ImageHeight() > MAX_INT16) {
      tprintf("Image too large: (%d, %d)\n",
              tesseract_->ImageWidth(), tesseract_->ImageHeight());
      return -1;
    }

1
votes

It's not the file size but the image dimensions (width and height in pixels) that exceed Tesseract's limits. I have had no problems with Tesseract recognizing a 16 MB image. Try resizing or rescaling your image so that both dimensions fit, then run it again.
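
To pick a target size for the resize, here is a minimal sketch that scales both dimensions down uniformly until the larger one fits the 32767 limit quoted in the other answer. The names `kMaxDimension`, `ExceedsLimit` and `FitWithinLimit` are made up for illustration and are not part of Tesseract or Leptonica; the actual resampling still has to be done with whatever image tool you use.

```cpp
#include <algorithm>
#include <utility>

// Largest width or height Tesseract 3.01 accepts (the value of MAX_INT16).
constexpr int kMaxDimension = 32767;

// Returns true when either dimension would trigger "Image too large".
bool ExceedsLimit(int width, int height) {
  return width > kMaxDimension || height > kMaxDimension;
}

// Hypothetical helper: returns (width, height) scaled down uniformly so
// that the larger side equals the limit, preserving the aspect ratio.
// Images that already fit are returned unchanged.
std::pair<int, int> FitWithinLimit(int width, int height) {
  if (!ExceedsLimit(width, height)) {
    return {width, height};
  }
  const long long longest = std::max(width, height);
  // Integer arithmetic (rounding down) keeps the result exact and
  // guarantees that neither side ends up above the limit.
  const long long w = static_cast<long long>(width) * kMaxDimension / longest;
  const long long h = static_cast<long long>(height) * kMaxDimension / longest;
  return {static_cast<int>(std::max(1LL, w)),
          static_cast<int>(std::max(1LL, h))};
}
```

For the image in the question, `FitWithinLimit(39667, 56133)` gives 23155 x 32767. Keep in mind that downscaling lowers the effective resolution of the text, so if accuracy suffers, cropping the page into smaller regions and running each one separately may work better.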