1
votes

I have written a program that reads the text in an image and prints it.

Environment Variable

  • Variable name : pytesseract
  • Variable value: pytesseract.pytesseract.tesseract_cmd= r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

Code:

try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract

pytesseract.pytesseract.tesseract_cmd= r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

# Raw string so the backslashes in the Windows path are never treated as escapes
image1 = Image.open(r"C:\python\program\image.png")

print(pytesseract.image_to_string(image1))

Error:

Traceback (most recent call last):
  File "C:/python/program/Image_OCR.py", line 13, in <module>
    print(pytesseract.image_to_string(image1))
  File "C:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 338, in image_to_string
    }[output_type]()
  File "C:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 337, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 246, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\python\venv\lib\site-packages\pytesseract\pytesseract.py", line 222, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))

pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\tessdata/eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

Expected result: the text in the image is printed.

1
Set the TESSDATA_PREFIX environment variable to C:\Program Files (x86)\Tesseract-OCR - Smart Manoj
After that change, it still shows the same error. - Mehul Jadav
Add import os; os.environ['TESSDATA_PREFIX'] = r'C:\Program Files (x86)\Tesseract-OCR' - Smart Manoj
I tried it, but it didn't work; same error. - Mehul Jadav
It seems to be unable to find \\Program Files (x86)\\Tesseract-OCR\\tessdata/eng.traineddata, so either you are not telling it the correct place or you don't have that file. Please clarify which of those two possibilities applies. - Mark Setchell
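
To tell those two possibilities apart, a quick check along these lines may help. It is only a sketch: the helper name find_traineddata is made up for illustration, the default install directory is the one from the question, and it assumes (as the error message says) that TESSDATA_PREFIX is the parent of the "tessdata" directory.

```python
import os


def find_traineddata(prefix, lang="eng"):
    """Return the path to <prefix>/tessdata/<lang>.traineddata, or None if it is missing.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    candidate = os.path.join(prefix, "tessdata", lang + ".traineddata")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    # Assumption: fall back to the install directory quoted in the question.
    prefix = os.environ.get(
        "TESSDATA_PREFIX", r"C:\Program Files (x86)\Tesseract-OCR"
    )
    found = find_traineddata(prefix)
    if found:
        print("Language data found:", found)
    else:
        print("No tessdata/eng.traineddata under:", prefix)
```

If the file is reported missing, the language data is probably not installed in that directory; if it is found, the prefix that Tesseract actually receives is the more likely culprit.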

1 Answer

2
votes

I had the same problem on Ubuntu. I commented out the line below:
pytesseract.pytesseract.tesseract_cmd = '/app/.apt/usr/bin/tesseract'
and it worked for me.
Try removing or commenting out this line:
pytesseract.pytesseract.tesseract_cmd= r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
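
Removing the line can only work if the tesseract executable is on PATH, since pytesseract then falls back to calling plain "tesseract". A rough sketch of a middle ground, assuming you still want an explicit path as a fallback (the helper name pick_tesseract_cmd is hypothetical, and the fallback path is the one from the question):

```python
import shutil


def pick_tesseract_cmd(fallback, name="tesseract"):
    """Prefer the executable found on PATH; otherwise return the explicit fallback.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    return shutil.which(name) or fallback


if __name__ == "__main__":
    cmd = pick_tesseract_cmd(
        r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
    )
    print("Would use:", cmd)
    # With pytesseract installed, the result could then be applied as:
    # pytesseract.pytesseract.tesseract_cmd = cmd
```

This does not fix a missing eng.traineddata by itself; it only avoids hard-coding a path that may differ from the installation Tesseract is actually run from.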