Good afternoon,

I am writing an OCR program to detect text in images. So far I am getting good results, but only when the text is black and the background is white. What can I do to improve results for images that have white text on a light-coloured background (yellow, green, etc.)?

One original example image could be: original image

So far I am just converting it to greyscale using:

image = image.convert('L')

Then I apply a series of filters, for example SHARPEN, SMOOTH, BLUR, etc.

Then I do binarization like this:

# See http://stackoverflow.com/questions/18777873/convert-rgb-to-black-or-white
# and http://stackoverflow.com/questions/29923827/extract-cow-number-from-image
image = image.point(lambda x: 0 if x<128 else 255, '1')

My output images are very bad for feeding to OCR, like this one: Output

What am I doing wrong? What would be the best approach for white text on a light-coloured background?

Another doubt: is my binarization step too strong/exaggerated?

Should I mix some filters? Could you suggest some?

PS: I am a total newbie to image processing, so please keep it simple =x

Thanks so much for your attention and help/advice.

1 Answer

I tried this with ImageMagick, which has Python bindings too - except I did it at the command line. I guess you can adapt what I did quite readily - I don't speak Pythonese nor use PIL but hopefully it will give you some insight as to a possible avenue.

convert http://i.stack.imgur.com/2cFk3.jpg -fuzz 50% -fill black +opaque white -threshold 50% x.png

Basically it takes any colour that is not within 50% of white and fills it with black, then it thresholds the result to pure black and white.
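
If you want to try the same idea in PIL, here is a rough sketch. It assumes that "within 50% of white" can be approximated as plain Euclidean distance in RGB, measured as a fraction of the black-to-white distance; ImageMagick's exact fuzz metric may differ. The names `is_near_white` and `binarize_near_white` are mine, not part of PIL, and it skips the final `-threshold` step by writing pure black or white directly.

```python
import math

# Largest possible RGB distance: from black (0,0,0) to white (255,255,255).
MAX_DIST = math.sqrt(3 * 255 ** 2)

def is_near_white(rgb, fuzz=0.5):
    """True if the pixel lies within `fuzz` (0..1) of pure white."""
    r, g, b = rgb
    dist = math.sqrt((255 - r) ** 2 + (255 - g) ** 2 + (255 - b) ** 2)
    return dist <= fuzz * MAX_DIST

def binarize_near_white(im, fuzz=0.5):
    """Return a greyscale copy of PIL image `im`: near-white -> 255, rest -> 0."""
    rgb = im.convert('RGB')
    out = rgb.convert('L')  # same size, one band; every pixel is overwritten below
    src, dst = rgb.load(), out.load()
    width, height = rgb.size
    for y in range(height):
        for x in range(width):
            dst[x, y] = 255 if is_near_white(src[x, y], fuzz) else 0
    return out

# Usage (needs Pillow installed; 'input.jpg' is a placeholder path):
#   from PIL import Image
#   binarize_near_white(Image.open('input.jpg')).save('x.png')
```

With `fuzz=0.5`, a pure yellow pixel (255, 255, 0) sits further than 50% from white and goes black, while white text stays white. Paler backgrounds will probably need a smaller `fuzz`.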

Another option would be to threshold the image according to the saturation of the colours. So, you convert to HSB colorspace, separate the channels and discard the hue and brightness. You are then left with the saturation which you threshold as follows:

convert http://i.stack.imgur.com/2cFk3.jpg -colorspace hsb -separate -delete 0,2 -threshold 50% x.png

Throw in a -negate to get white letters on black.

I have copied some other code for PIL, and am modifying it kind of/sort of to something that may be close to what you need - bear in mind I know no Python:

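import colorsys
from PIL import Image

filename = 'input.jpg'  # placeholder path; point this at your own image
im = Image.open(filename).convert('RGB')  # force 3 channels so r,g,b unpacking works
ld = im.load()
width, height = im.size
for y in range(height):
    for x in range(width):
        r, g, b = ld[x, y]
        h, s, v = colorsys.rgb_to_hsv(r/255., g/255., b/255.)

        # Saturated (coloured) pixels go black; unsaturated ones (white/grey) go white.
        if s > 0.5:                     # <--- here onwards is my attempted Python
            ld[x, y] = (0, 0, 0)
        else:
            ld[x, y] = (255, 255, 255)

im.save('x.png')  # write the result, same output name as the commands above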
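
The loop above visits every pixel in Python, which can be slow on large images. As far as I can tell, PIL can do the HSV conversion itself, so a sketch closer to the ImageMagick saturation command might look like this. The helper names `saturation_lut` and `binarize_by_saturation` are my own, and the default threshold of 127 is just my stand-in for "50%" on a 0..255 scale, so treat both as assumptions to tune.

```python
def saturation_lut(threshold=127, negate=False):
    """Build a 256-entry lookup table for the saturation band.

    Saturation above `threshold` maps to white (255), the rest to black (0),
    like `-threshold 50%`. With negate=True the two are swapped, like `-negate`.
    """
    lut = [255 if s > threshold else 0 for s in range(256)]
    if negate:
        lut = [255 - v for v in lut]
    return lut

def binarize_by_saturation(im, threshold=127, negate=False):
    """Return a single-band image thresholded on the saturation of PIL image `im`."""
    # Go through RGB first so that any input mode can reach HSV.
    hsv = im.convert('RGB').convert('HSV')
    saturation = hsv.split()[1]  # bands are H, S, V; keep only S
    return saturation.point(saturation_lut(threshold, negate))

# Usage (needs Pillow installed; 'input.jpg' is a placeholder path):
#   from PIL import Image
#   binarize_by_saturation(Image.open('input.jpg'), negate=True).save('x.png')
```

With `negate=True` the coloured background comes out black and the white letters stay white, which matches what the loop above produces.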