I have a series of images, each containing a word. Instead of running pytesseract OCR on all of the images separately (which works fine), I would like to compile the images into one large image and run pytesseract OCR on that (to lower runtime).
What is the best way to arrange the images to get the most accurate results (e.g., should they be lined up horizontally, vertically, jumbled, etc.)?
Also, what would be the best page segmentation mode?
I have tried horizontally concatenating the images and then using PSM 7 (treating the image as a single line of text). However, this did not produce results as good as running pytesseract OCR on each individual word image using PSM 8 (treating the image as a single word).
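
For reference, here is a minimal sketch of the horizontal-concatenation approach described above. It is not my exact code: the helper names `concat_horizontally` and `ocr_line`, the `gap` padding, and the white background are assumptions for illustration, and it assumes the word images are Pillow `Image` objects with dark text on a light background.

```python
from PIL import Image


def concat_horizontally(images, gap=20, background=255):
    """Paste word images side by side on a white canvas (hypothetical helper)."""
    images = [im.convert("L") for im in images]  # work in grayscale
    height = max(im.height for im in images)
    # Leave `gap` pixels between words and around the border.
    width = sum(im.width for im in images) + gap * (len(images) + 1)
    canvas = Image.new("L", (width, height + 2 * gap), background)
    x = gap
    for im in images:
        # Center each word vertically within the row.
        canvas.paste(im, (x, gap + (height - im.height) // 2))
        x += im.width + gap
    return canvas


def ocr_line(images):
    """OCR the combined image as a single text line (PSM 7)."""
    # Imported here so concat_horizontally can be used without Tesseract installed.
    import pytesseract

    combined = concat_horizontally(images)
    text = pytesseract.image_to_string(combined, config="--psm 7")
    return text.split()  # one entry per recognized word
```

The per-word baseline I am comparing against would call `pytesseract.image_to_string(im, config="--psm 8")` once for each image.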