18
votes

I have searched the net for a couple of hours. Many answers say that we need to use the NDK, etc. to build "Tesseract" on Windows.

But I couldn't find a proper step-by-step explanation of what to do once the NDK is installed. How do I get the .so files? I have finished installing the NDK and Cygwin. To check that the installation worked, I ran make -v, and it gave the expected output.

Can anyone who has used "Tesseract" tell me how they did it? (I have downloaded "Mezzofanti", but I didn't find any of the "Tesseract" files there.)

4 Answers

20
votes

You need to use the tess-two project to work with Tesseract on Android.
tess-two contains tools for compiling the Tesseract and Leptonica libraries for use on the Android platform. It also provides a Java API for accessing the natively compiled Tesseract and Leptonica APIs.

Adding tess-two to your project:

Add this to build.gradle:

dependencies {
    compile 'com.rmtheis:tess-two:5.4.1'
}

Using Tesseract:

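import android.graphics.Bitmap;
import com.googlecode.tesseract.android.TessBaseAPI;

// DATA_PATH must be the parent directory of a "tessdata" folder
// that contains eng.traineddata.
private String extractText(Bitmap bitmap) throws Exception {
    TessBaseAPI tessBaseApi = new TessBaseAPI();
    try {
        tessBaseApi.init(DATA_PATH, "eng");  // load the English language data
        tessBaseApi.setImage(bitmap);        // image to recognize
        return tessBaseApi.getUTF8Text();    // run OCR and return the text
    } finally {
        tessBaseApi.end();                   // release native memory even on failure
    }
}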

You can look at my simple one-class example of using Tesseract on Android. It contains only about 200 lines of Java code.
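If init fails, one thing worth checking is whether DATA_PATH matches the layout tess-two expects: a "tessdata" folder under DATA_PATH that holds the <language>.traineddata file. Here is a minimal sketch of such a check in plain Java. The class and method names (TessDataCheck, trainedDataFile, isReady) are hypothetical helpers made up for illustration; they are not part of tess-two.

```java
import java.io.File;

/** Hypothetical helper (not part of tess-two) for checking the data layout. */
final class TessDataCheck {
    private TessDataCheck() {}

    /** Builds <dataPath>/tessdata/<language>.traineddata, the file expected under DATA_PATH. */
    static File trainedDataFile(String dataPath, String language) {
        return new File(new File(dataPath, "tessdata"), language + ".traineddata");
    }

    /** Returns true only if the traineddata file exists and is not empty. */
    static boolean isReady(String dataPath, String language) {
        File file = trainedDataFile(dataPath, language);
        return file.isFile() && file.length() > 0;
    }
}
```

You could call TessDataCheck.isReady(DATA_PATH, "eng") right before tessBaseApi.init(DATA_PATH, "eng") and log a clear message when it returns false.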

15
votes

You can refer to this document; it walks through the process step by step. What you need to do is set up the tesseract-android-tools project as a library project in Eclipse and tell your own project to refer to that library project. So you'll need two projects in Eclipse.

http://rmtheis.wordpress.com/2011/08/06/using-tesseract-tools-for-android-to-create-a-basic-ocr-app/

I hope this helps.

0
votes

This video shows you exactly how it is done

How can I use Tesseract in Android?

Make sure to:

1. Create the folder (tess-two expects it to be named tessdata, inside the directory you pass as DATA_PATH).
2. Put the traineddata file in that folder. (You can download it in the language you require from https://github.com/tesseract-ocr/tessdata/tree/3.04.00 )
3. Pass the data path and the language when initializing: tessBaseApi.init(DATA_PATH, "eng");
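The steps above can be sketched roughly like this in plain Java. This is only an illustration under some assumptions: TessDataInstaller and install are hypothetical names (not part of tess-two), and the traineddata content is assumed to arrive as an InputStream from wherever you keep it.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper (not part of tess-two) that prepares the data folder. */
final class TessDataInstaller {
    private TessDataInstaller() {}

    /**
     * Creates <dataDir>/tessdata if needed and copies the stream into
     * <dataDir>/tessdata/<language>.traineddata. Returns the written file.
     */
    static File install(File dataDir, String language, InputStream trainedData) throws IOException {
        // Step 1: create the folder.
        File tessdata = new File(dataDir, "tessdata");
        if (!tessdata.isDirectory() && !tessdata.mkdirs()) {
            throw new IOException("Could not create " + tessdata);
        }
        // Step 2: put the traineddata file in that folder.
        File target = new File(tessdata, language + ".traineddata");
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = trainedData.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        // Step 3 happens in the caller: pass dataDir's path to init.
        return target;
    }
}
```

On Android the stream could come from something like context.getAssets().open("tessdata/eng.traineddata") if you ship the file in your assets (the asset path here is an assumption), with context.getFilesDir() as dataDir. You would then pass dataDir.getAbsolutePath() as DATA_PATH.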

Hope it helps