
I am trying to use Tesseract to add OCR functionality to a Java application. To do this, I am using the Java/Tesseract bridge found here.

pom.xml dependency:

<dependency>
    <groupId>org.bytedeco.javacpp-presets</groupId>
    <artifactId>tesseract</artifactId>
    <version>3.04-1.1</version>
</dependency>

It works: I can use the library to run OCR on an image. But when the Java program finishes, the JVM crashes. As a minimal example, the very first Tesseract initialization line is enough to trigger it:

import org.bytedeco.javacpp.tesseract.TessBaseAPI;

public class MinimalExample {

    public static void main(String[] args) {
        System.out.println("Hi!");
        TessBaseAPI tessAPI = new TessBaseAPI();
    }
}

Running this main method prints the following:

Hi!

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

It is followed by this Windows error dialog: "Java(TM) Platform SE binary funktioniert nicht mehr – Windows kann online nach einer Lösung für das Problem suchen." (In English: "Java(TM) Platform SE binary has stopped working – Windows can check online for a solution to the problem.")

Problem signature (field names translated from German):
  Problem Event Name:  APPCRASH
  Application Name:   java.exe
  Application Version:    8.0.650.17
  Application Timestamp:    5614685f
  Fault Module Name:  libgcc_s_dw2-1.dll
  Fault Module Version:   0.0.0.0
  Fault Module Timestamp:   3f263ec2
  Exception Code: 40000015
  Exception Offset:   000149a1
  OS Version: 6.1.7601.2.1.0.256.49
  Locale ID: 1031
  Additional Information 1:  7309
  Additional Information 2:  73092f5dbc78923c702ae5601110d2ea
  Additional Information 3:  9fa1
  Additional Information 4:  9fa11625863fb37077a4ab55be352b96

I've never had Java crashing before – but I've also never used natives before. ;-) Does anybody have a hint where to look for a solution to this strange behaviour?

Edit 2015-12-07: Using ListDLLs, I've seen that the DLL in question is located in C:\Users\...\AppData\Local\Temp\javacpp3256864312633\libgcc_s_dw2-1.dll, so "Wrong DLL from %PATH%" is not the answer.
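As a cross-check for the "wrong DLL from %PATH%" theory, the directories on the search path that contain a copy of the DLL can be listed from Java. This is only a rough sketch: the class name DllPathCheck and the method findOnPath are made up for illustration, and it only inspects the PATH variable, not the full Windows DLL search order or what the JVM has actually loaded (ListDLLs remains the tool for that).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JavaCPP or Tesseract.
public class DllPathCheck {

    // Returns every copy of fileName found in the directories of searchPath,
    // in the order the directories appear.
    public static List<Path> findOnPath(String searchPath, String fileName) {
        List<Path> hits = new ArrayList<>();
        if (searchPath == null) {
            return hits;
        }
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                Path candidate = Paths.get(dir, fileName);
                if (Files.isRegularFile(candidate)) {
                    hits.add(candidate);
                }
            } catch (InvalidPathException e) {
                // Skip malformed PATH entries (for example, ones containing quotes).
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The default DLL name is taken from the crash report above.
        String dll = args.length > 0 ? args[0] : "libgcc_s_dw2-1.dll";
        for (Path hit : findOnPath(System.getenv("PATH"), dll)) {
            System.out.println(hit);
        }
    }
}
```

If this prints nothing, no directory on PATH holds a copy of the DLL, which would be consistent with the copy in the javacpp temp folder being the one in use.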

Looks like an issue with MSYS2: sourceforge.net/p/msys2/mailman/msys2-users/thread/… Sounds like this is fixed in the latest version. Would need to rebuild to find out. – Samuel Audet
@SamuelAudet: Do I understand correctly that the error lies in the Tesseract libraries and that they would have to be recompiled? – Kurtibert
It looks like the issue lies in the C++ runtime, and we might need to rebuild. Simply replacing libgcc_s_dw2-1.dll with the newest version from MSYS2 might also work. – Samuel Audet
Ohhh, I just got it now: Samuel Audet == saudet! ;) How could I replace the DLL? It sits in a jar that is loaded by Maven, and I do not know how to interfere with that process without sacrificing the benefits of the Maven build. – Kurtibert
Well, try it outside Maven and if it works, we'll figure something out. – Samuel Audet

1 Answer


The problem may be with libwinpthread-1.dll.

After replacing the libwinpthread-1.dll bundled in the jar with the latest DLL from mingw32, it works fine.

  1. Install msys2-x86_64-20150916.exe, downloaded from https://msys2.github.io/ .
  2. Install base-devel and mingw-w64-i686-toolchain using pacman.
  3. Extract leptonica-1.72-1.1-windows-x86.jar and put all of its DLLs into the same folder as your application.
  4. Remove leptonica-1.72-1.1-windows-x86.jar from the classpath.
  5. Remove libwinpthread-1.dll from that folder (or replace it with the installed C:\msys64\mingw32\bin\libwinpthread-1.dll). The path "C:\msys64\mingw32\bin" seems to be searched first, so if you can install mingw32, there is no need to remove or replace it.
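
Steps 3 and 5 can also be done from a small Java program instead of by hand. The following is a minimal sketch using only java.util.zip; the class name ExtractDlls and its extractDlls method are hypothetical, and it assumes the DLLs may sit in subdirectories inside the jar, so it flattens them into the target folder.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of JavaCPP or Tesseract.
public class ExtractDlls {

    // Copies every *.dll entry of the jar into targetDir, except names listed in skip.
    public static List<Path> extractDlls(Path jar, Path targetDir, Set<String> skip) throws IOException {
        List<Path> written = new ArrayList<>();
        Path base = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(base);
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                // Keep only the file name; directories inside the jar are flattened.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (entry.isDirectory()
                        || !fileName.toLowerCase(Locale.ROOT).endsWith(".dll")
                        || skip.contains(fileName)) {
                    continue;
                }
                Path out = base.resolve(fileName).normalize();
                if (!out.startsWith(base)) {
                    continue; // Ignore entries that would land outside the target folder.
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                written.add(out);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Defaults follow the steps above: the leptonica jar, the current folder,
        // and libwinpthread-1.dll left out.
        Path jar = Paths.get(args.length > 0 ? args[0] : "leptonica-1.72-1.1-windows-x86.jar");
        Path dir = Paths.get(args.length > 1 ? args[1] : ".");
        Set<String> skip = Collections.singleton("libwinpthread-1.dll");
        for (Path p : extractDlls(jar, dir, skip)) {
            System.out.println("extracted " + p);
        }
    }
}
```

Skipping libwinpthread-1.dll corresponds to the "remove" variant of step 5; to use the "replace" variant instead, copy C:\msys64\mingw32\bin\libwinpthread-1.dll into the same folder afterwards.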