I am using tessnet2 as described in "tessnet2 C# simple example".

 var image = new Bitmap(@"C:\OCRTest\number.jpg");
 var ocr = new Tesseract();
 ocr.SetVariable("tessedit_char_whitelist", "0123456789"); // digits only
 // @"C:\OCRTest\tessdata" contains the language package; without it the method crashes and the app breaks
 ocr.Init(@"C:\OCRTest\tessdata", "eng", true);
 var result = ocr.DoOCR(image, Rectangle.Empty);
 foreach (Word word in result)
 {
     Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
 }
 Console.ReadLine();

But when I give it an image that contains both English words and numbers, or only English words, it returns only numbers (the numbers present in the image plus some extra ones). I tried commenting out the third line (the tessedit_char_whitelist call), but then it does not even recognize the digits. Does anyone know how to use tessnet2 in C# so that it reads all letters, words, and digits?

1 Answer

Just replace the line

ocr.Init(@"C:\OCRTest\tessdata", "eng", true);

with

ocr.Init(@"C:\OCRTest\tessdata", "eng", false);

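and comment out the third line (the digit whitelist):

//ocr.SetVariable("tessedit_char_whitelist", "0123456789");

Then it will work. The third argument to Init turns numeric mode on or off; with it set to false and the whitelist removed, the engine is no longer restricted to digits and can return letters as well.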
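
Putting both changes together, the question's snippet might look roughly like the sketch below. It only uses the calls that already appear in the question (Tesseract, Init, DoOCR, Word.Confidence, Word.Text); the Program class, the Main wrapper, and the using directives are assumptions added so that it compiles as a console app with references to tessnet2 and System.Drawing. The paths are the ones from the question and need to exist on your machine.

```csharp
using System;
using System.Drawing;
using tessnet2;

class Program
{
    static void Main()
    {
        // Paths copied from the question; adjust them to your setup.
        using (var image = new Bitmap(@"C:\OCRTest\number.jpg"))
        {
            var ocr = new Tesseract();

            // No tessedit_char_whitelist here, so letters are not filtered out.
            // ocr.SetVariable("tessedit_char_whitelist", "0123456789");

            // Third argument false: numeric-only mode is switched off.
            ocr.Init(@"C:\OCRTest\tessdata", "eng", false);

            // Rectangle.Empty asks for the whole image, as in the question.
            var result = ocr.DoOCR(image, Rectangle.Empty);
            foreach (Word word in result)
            {
                Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
            }
        }

        Console.ReadLine();
    }
}
```

If you later need digits only again, restoring the whitelist line and passing true to Init should bring back the original behaviour.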