I am writing a C++ program that uses Tesseract to read hex data from JPEG images. It has to compile into a single Windows executable with no external resources (such as the "tessdata" directory or config files). Since I am not reading words or sentences, I don't need any dictionaries or languages.
My problem is that I cannot find a way to initialize the API without any language files. Every example uses something like this:
    tesseract::TessBaseAPI api;
    if (api.Init(NULL, "eng")) {
        // error handling
        return -1;
    }
    // do stuff
I also found that I can call the Init function without a language argument and with OEM_TESSERACT_ONLY:
    if (api.Init(NULL, NULL, tesseract::OcrEngineMode::OEM_TESSERACT_ONLY)) {
        // ...
    }
This should disable the language/dictionary, but NULL just defaults to "eng". It seems that Tesseract still needs a language file to initialize and only disables it afterwards.
The same seems to be true of every other solution I have found so far: I always need .traineddata files to initialize the API, and can only disable them afterwards or through config files.
My question is: is there any way to initialize the Tesseract API in C++ using just the executable and no other resource files?
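To make the "no external resources" constraint concrete: in general, data can be compiled into an executable as a byte array instead of being shipped as a file. The sketch below shows only that embedding idea. It makes no Tesseract calls, and the names kEmbeddedData, kEmbeddedDataSize and embedded_data_as_string are hypothetical placeholders (the four bytes are dummy values, not real .traineddata contents). Whether the API can be initialized from such an in-memory buffer is part of what is being asked here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical placeholder bytes standing in for data that would otherwise
// live in an external file. These are NOT real traineddata contents.
static const std::uint8_t kEmbeddedData[] = {0x54, 0x45, 0x53, 0x53};

// Size of the embedded blob in bytes, known at compile time.
constexpr std::size_t kEmbeddedDataSize = sizeof(kEmbeddedData);

// Copies the embedded bytes into a std::string buffer, which is a common
// way to hand binary data to an interface that accepts memory rather than
// a path on disk.
std::string embedded_data_as_string() {
    return std::string(reinterpret_cast<const char*>(kEmbeddedData),
                       kEmbeddedDataSize);
}
```

With this approach the executable carries the bytes itself, so nothing has to exist next to it on disk at run time. The open part is the library side.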