8
votes

I have this problem in my web application in Tomcat 9:

Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!

I created the folder C:\Tess4J from the Tess4J 3.0.4 zip, with these subfolders:

  • dist
    • tess4j-3.0.jar
  • lib
    • LIBS
  • nbproject
  • src
  • tessdata
    • Downloaded ZIP with languages and extracted here
  • test

In catalina.properties I added:

  • C:/Tess4J/dist/tess4j-3.0.jar,C:/Tess4J/lib

In the environment variables I tried each of the following values, and neither works:

  • TESSDATA_PREFIX --> C:/Tess4J
  • TESSDATA_PREFIX --> C:/Tess4J/tessdata

Then I invoke my servlet, which calls the doOCR method, and I get the error above.

Could you help me, please?

1
Are you sure you are using the 3.0 Tesseract version (it is incompatible with the older version)? The tessdata folder should contain files like "eng.traineddata", "eng.cube.bigrams", "eng.cube.fold" etc. You can download them here: github.com/tesseract-ocr/tessdata - Radim Burget

1 Answer

15
votes

You have to set the data path to the parent directory of tessdata, in your case C:\Tess4J. Try this:

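import net.sourceforge.tess4j.Tesseract;

Tesseract tessInst = new Tesseract();
// Parent directory of "tessdata", not the tessdata folder itself
tessInst.setDatapath("C:\\Tess4J");
tessInst.setLanguage("eng");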
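
If it still fails, it may help to confirm from inside the web application that the language file is really where Tesseract will look. Below is a minimal sketch using only the JDK. The class TessdataCheck and its method names are hypothetical (my own, not part of Tess4J), and it assumes the layout described in the error message: the data path is the parent of "tessdata", and the language file is named eng.traineddata.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tess4J: sanity-checks a data path
// before it is handed to Tesseract.setDatapath(...).
final class TessdataCheck {

    // If the configured path points at the "tessdata" folder itself,
    // step up to its parent; otherwise return the path unchanged.
    static Path toParentOfTessdata(Path configured) {
        Path name = configured.getFileName();
        if (name != null
                && name.toString().equalsIgnoreCase("tessdata")
                && configured.getParent() != null) {
            return configured.getParent();
        }
        return configured;
    }

    // True if <dataPath>/tessdata/<lang>.traineddata exists as a regular file.
    static boolean hasLanguage(Path dataPath, String lang) {
        Path file = dataPath.resolve("tessdata").resolve(lang + ".traineddata");
        return Files.isRegularFile(file);
    }

    public static void main(String[] args) {
        // Assumption: C:/Tess4J is the install folder from the question.
        Path configured = Paths.get(args.length > 0 ? args[0] : "C:/Tess4J");
        Path dataPath = toParentOfTessdata(configured);
        System.out.println("Data path: " + dataPath.toAbsolutePath());
        System.out.println("eng.traineddata found: " + hasLanguage(dataPath, "eng"));
    }
}
```

If hasLanguage returns false for C:\Tess4J, the language ZIP was probably extracted into a nested folder, or the Tomcat process cannot read the directory. Otherwise the checked path can be passed on with tessInst.setDatapath(dataPath.toString()).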

Sorry about my English.