Tesseract-OCR (3.02) recognition accuracy and speed

Question

I have a group of very small images (w: 70-100; h: 12-20), like the one below:

Each image contains nothing but a group member's nickname. The images are simple: they all share the same background, and only the nicknames differ. Here is what I've done with that image:

I am using the code below to get the text from the second image:

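// Assumes the Tesseract and Leptonica headers and <string> are included.
tesseract::TessBaseAPI ocr;
ocr.Init(NULL, "eng");                 // returns 0 on success
PIX* pix = pixRead("D:\\image.png");
ocr.SetImage(pix);
char* text = ocr.GetUTF8Text();        // the caller owns the returned buffer
std::string result(text);
delete[] text;                         // free the text returned by Tesseract
pixDestroy(&pix);                      // release the Leptonica image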

I have two problems with that:

1. ocr.GetUTF8Text() is slow: 650-750 ms. The image is small, so why does it take so long?
2. From the image above I get results like "iwillkillsm", "iwillkillsel", etc. The image is simple, and I believe Tesseract experts could get it recognized with 100% accuracy.

What should I change in the image or the code, or what should I read (and where) about tesseract-ocr recognition speed and quality, to solve these problems?

I had the best luck with tesseract when I would greatly increase the dimensions of the images. — nlloyd
@nlloyd After increasing the dimensions I get better results (speed and accuracy), thank you! But I have to ask: is it okay that after resizing there are some gray or almost black pixels in the image? Does that help tesseract or not? — Anton Kasabutski
Seems OK to me. I always made the images bigger before feeding them to tesseract; there's a limit to how big you can make them before you start getting worse results, however :) — nlloyd

nlloyd · Accepted Answer · 2016-07-02T06:25:43

It may sound odd, but I've always had the best luck with tesseract when I increased the dimensions of the image. The image would look "worse" to me but tesseract went faster and had much better accuracy.

There is a limit to how big you can make the images before you start getting worse results, however :) I think I remember shooting for 600px in the past. You'll have to play with it, though.
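
As a rough illustration of the upscaling idea, here is a minimal sketch in plain C++ with no OCR library involved. The names upscaleFactor, GrayImage and scaleBilinear are hypothetical, and reading "600px" as a target for the longer side of the image is an assumption, not something stated above. The bilinear resize also shows where the gray pixels mentioned in the comments come from: interpolating between black and white neighbours produces intermediate values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: uniform factor that brings the longer side of the
// image to about targetLongSide pixels. The 600 default is an assumption
// based on the "600px" remark; tune it for your own images.
// Images that are already large enough are left alone (factor 1.0).
double upscaleFactor(int width, int height, int targetLongSide = 600) {
    const int longSide = std::max(width, height);
    if (longSide <= 0) return 1.0;
    return std::max(1.0, static_cast<double>(targetLongSide) / longSide);
}

// Minimal 8-bit grayscale image, stored row by row.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Bilinear resize by a uniform factor. Interpolation blends neighbouring
// pixels, so a pure black/white source gains intermediate gray values.
GrayImage scaleBilinear(const GrayImage& src, double factor) {
    assert(factor > 0.0 && src.width > 0 && src.height > 0);
    assert(src.pixels.size() ==
           static_cast<std::size_t>(src.width) * src.height);

    GrayImage dst;
    dst.width = std::max(1, static_cast<int>(std::lround(src.width * factor)));
    dst.height = std::max(1, static_cast<int>(std::lround(src.height * factor)));
    dst.pixels.resize(static_cast<std::size_t>(dst.width) * dst.height);

    for (int y = 0; y < dst.height; ++y) {
        // Map the destination pixel centre back into source coordinates,
        // clamped to the valid source range.
        double sy = (y + 0.5) / factor - 0.5;
        sy = std::min(std::max(sy, 0.0), src.height - 1.0);
        const int y0 = static_cast<int>(sy);
        const int y1 = std::min(y0 + 1, src.height - 1);
        const double fy = sy - y0;

        for (int x = 0; x < dst.width; ++x) {
            double sx = (x + 0.5) / factor - 0.5;
            sx = std::min(std::max(sx, 0.0), src.width - 1.0);
            const int x0 = static_cast<int>(sx);
            const int x1 = std::min(x0 + 1, src.width - 1);
            const double fx = sx - x0;

            // Blend the four surrounding source pixels.
            const double top = src.pixels[y0 * src.width + x0] * (1.0 - fx) +
                               src.pixels[y0 * src.width + x1] * fx;
            const double bottom = src.pixels[y1 * src.width + x0] * (1.0 - fx) +
                                  src.pixels[y1 * src.width + x1] * fx;
            dst.pixels[static_cast<std::size_t>(y) * dst.width + x] =
                static_cast<std::uint8_t>(
                    std::lround(top * (1.0 - fy) + bottom * fy));
        }
    }
    return dst;
}
```

For a 100x20 image, upscaleFactor(100, 20) gives 6.0, so the image would be enlarged to about 600x120 before OCR. If you keep the image as a Leptonica PIX, the same factor could be passed to pixScale(pix, f, f) instead of resizing by hand; either way the right target size has to be found by experiment.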
