Training the "tesseract ocr" with predefined font images

Question

I am trying to recognize ASCII strings in images using OCR. I am using the Tesseract 3 library, but recognition is not accurate enough, so I need to train it on a new, specific character set. I have already found the TrainingTesseract3 how-to, but it includes steps I do not need because my test images are so simple. My data set consists only of single-line images in which each ASCII character looks the same in every image (no rotation, no scaling); only the horizontal spacing between characters varies.

How can I use font images to train the recognition algorithm?

Jeffrey Orcena · Accepted Answer · 2014-06-11T00:43:12

Pick the font you want to train, type the letters and digits in Notepad (I think about 5 repetitions per character is enough), and save the result as a TIFF file. To run the training, use either of these tools: https://code.google.com/p/serak-tesseract-trainer/ or http://vietocr.sourceforge.net/training.html.
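If you would rather not type the sample text by hand, a small script can generate it. The following is only a sketch of the "about 5 repetitions per character" idea: the function name `build_training_lines`, the default character set (printable ASCII without whitespace), the shuffling, and the 40 characters per line are my own assumptions, not something either trainer tool requires.

```python
import random
import string


def build_training_lines(chars=None, reps=5, per_line=40, seed=0):
    """Return text lines in which every character of `chars` appears `reps` times.

    All parameters are illustrative defaults, not requirements of Tesseract.
    """
    if chars is None:
        # Assumed character set: digits, ASCII letters and punctuation.
        chars = string.digits + string.ascii_letters + string.punctuation

    # Repeat every character `reps` times, then shuffle so that the same
    # glyph shows up next to different neighbours.
    pool = list(chars) * reps
    random.Random(seed).shuffle(pool)

    # Separate characters with a space so the rendered glyphs do not touch.
    return [
        " ".join(pool[i:i + per_line])
        for i in range(0, len(pool), per_line)
    ]


if __name__ == "__main__":
    for line in build_training_lines():
        print(line)
```

Redirect the output to a text file, open it in an editor, set the font you want to train, and export the page as a TIFF image for the trainer tool.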
