OK, so I have spent the best part of today trying to get OCR to work at all. It is no longer crashing, but when I give it a file containing text rather than just numbers, a lot of weird text is pumped out.

Source code:

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using tessnet2;

namespace OCRTest
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                var image = new Bitmap(@"C:\Users\Ryan\Documents\visual studio 2015\Projects\OCRTest\testimage.jpg");
                var ocr = new Tesseract();
                ocr.Init(@"C:\Users\Ryan\Documents\visual studio 2015\Projects\OCRTest\tessdata", "eng", true);
                var result = ocr.DoOCR(image, Rectangle.Empty);
                foreach (Word word in result)
                {
                    Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
                }
            }
            catch (Exception exception)
            {
                Console.WriteLine(exception);
            }
            Console.ReadLine();
        }
    }
}

As the code shows, I am using tessnet2 along with the eng tessdata.

When I input this image:

Test data image

I get this response from the program:

Result from program

Thanks in advance for any help or links to further tutorials you may have. I followed this tutorial to get this far. Ryan

1 Answer

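Fixed the issue. I made a silly mistake and set the last parameter of ocr.Init() to true rather than false. That parameter is tessnet2's numeric mode flag, so with true the engine only tries to recognise digits, which is why numbers came out fine but ordinary text turned into garbage.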
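
For anyone hitting the same thing, here is a minimal sketch of the corrected call. It is the code from the question with only the last argument of Init changed; the relative paths are placeholders I made up for illustration, so substitute your own image and tessdata locations.

```csharp
using System;
using System.Drawing;
using tessnet2;

namespace OCRTest
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Placeholder paths (assumptions): point these at your own files.
                using (var image = new Bitmap(@"testimage.jpg"))
                {
                    var ocr = new Tesseract();

                    // Third argument is the numeric mode flag.
                    // false = recognise general text, true = digits only.
                    ocr.Init(@"tessdata", "eng", false);

                    // Rectangle.Empty means "process the whole image".
                    var result = ocr.DoOCR(image, Rectangle.Empty);
                    foreach (Word word in result)
                    {
                        Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
                    }
                }
            }
            catch (Exception exception)
            {
                Console.WriteLine(exception);
            }
            Console.ReadLine();
        }
    }
}
```

If the image really does contain only numbers, passing true again should be fine; for mixed or plain text, false is the value to use.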