Goal
Getting the same quality of result with an OpenCV Mat as with a Leptonica Pix when doing OCR with Tesseract.
Environment
C++17, OpenCV 3.4.1, Tesseract 3.05.01, Leptonica 1.74.4, Visual Studio Community 2017, Windows 10 Pro 64-bit
Description
I'm working with Tesseract for OCR and have found what I think is peculiar behaviour.
This is my code:
#include "stdafx.h"
#include <iostream>
#include <opencv2/opencv.hpp>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#pragma comment(lib, "ws2_32.lib")
using namespace std;
using namespace cv;
using namespace tesseract;
void opencvVariant(string titleFile);
void leptonicaVariant(const char* titleFile);
int main()
{
    cout << "Tesseract with OpenCV and Leptonica" << endl;
    const char* titleFile = "raptor-companion-2.jpg";
    opencvVariant(titleFile);
    leptonicaVariant(titleFile);
    cout << endl;
    system("pause");
    return 0;
}
void opencvVariant(string titleFile) {
    cout << endl << "OpenCV variant..." << endl;
    TessBaseAPI ocr;
    ocr.Init(NULL, "eng");
    Mat image = imread(titleFile);
    ocr.SetImage(image.data, image.cols, image.rows, 1, image.step);
    char* outText = ocr.GetUTF8Text();
    int confidence = ocr.MeanTextConf();
    cout << "Text: " << outText << endl;
    cout << "Confidence: " << confidence << endl;
}
void leptonicaVariant(const char* titleFile) {
    cout << endl << "Leptonica variant..." << endl;
    TessBaseAPI ocr;
    ocr.Init(NULL, "eng");
    Pix *image = pixRead(titleFile);
    ocr.SetImage(image);
    char* outText = ocr.GetUTF8Text();
    int confidence = ocr.MeanTextConf();
    cout << "Text: " << outText << endl;
    cout << "Confidence: " << confidence << endl;
}
The methods opencvVariant and leptonicaVariant are basically the same, except that one uses the class Mat from OpenCV and the other uses Pix from Leptonica. Yet the results are quite different.
OpenCV variant...
Text: Rapton
Confidence: 68
Leptonica variant...
Text: Raptor Companion
Confidence: 83
As one can see in the output above, the Pix variant gives a much better result than the Mat variant. Since my code relies heavily on OpenCV for the computer vision before the OCR, it's essential for me that the OCR works well with OpenCV and its classes.
Questions
- Why does Pix give a better result than Mat, and vice versa?
- How could the algorithm be changed to make the Mat variant as efficient as the Pix variant?
Try ocr.SetImage((uchar*)image.data, image.size().width, image.size().height, image.channels(), image.step1()). By default imread reads a colour image (even if it looks black and white), so you're likely only giving Tesseract the blue channel of the Mat. – Dmitrii Z.
I changed it to ocr.SetImage((uchar*)image.data, image.size().width, image.size().height, image.channels(), image.step1()); and now I get identical results! You should write your comment as an answer! :-) – Björn Larsson
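To illustrate why the bytes-per-pixel argument discussed in the comments matters, here is a small library-free sketch. It does not call Tesseract or OpenCV: firstByteOfPixel and makeBgrRow are hypothetical helpers that only model how a consumer might index a raw buffer from a bytes-per-pixel and bytes-per-line pair. That indexing rule is an assumption about the behaviour, not Tesseract's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model (assumption): a consumer that is told
// (bytesPerPixel, bytesPerLine) locates pixel (x, y) at
// y * bytesPerLine + x * bytesPerPixel and reads from there.
unsigned char firstByteOfPixel(const std::vector<unsigned char>& buf,
                               int x, int y,
                               int bytesPerPixel, int bytesPerLine) {
    const std::size_t offset =
        static_cast<std::size_t>(y) * bytesPerLine +
        static_cast<std::size_t>(x) * bytesPerPixel;
    assert(offset < buf.size());
    return buf[offset];
}

// Build one row of interleaved BGR pixels, the layout imread produces
// for colour images. Pixel i gets B = 10*i, G = 10*i + 1, R = 10*i + 2,
// so every byte in the row can be traced back to its pixel and channel.
std::vector<unsigned char> makeBgrRow(int width) {
    std::vector<unsigned char> row;
    row.reserve(static_cast<std::size_t>(width) * 3);
    for (int i = 0; i < width; ++i) {
        row.push_back(static_cast<unsigned char>(10 * i));      // blue
        row.push_back(static_cast<unsigned char>(10 * i + 1));  // green
        row.push_back(static_cast<unsigned char>(10 * i + 2));  // red
    }
    return row;
}

// The call from the comments keeps the declared layout in line with the
// buffer by passing the real channel count:
//   ocr.SetImage((uchar*)image.data, image.size().width,
//                image.size().height, image.channels(), image.step1());
```

For a 4-pixel BGR row (12 bytes), firstByteOfPixel(row, 2, 0, 3, 12) lands on the blue byte of pixel 2, whereas firstByteOfPixel(row, 2, 0, 1, 12) lands on the red byte of pixel 0. Under this model, declaring 1 byte per pixel for a 3-channel Mat would make the reader consume only the first third of each row, with the colour channels interleaved as if they were neighbouring grey pixels, which may be why only part of the text was recognised in the OpenCV variant.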