3 votes

I'm trying to develop a system which can convert a seven-segment display on an old analog pressure output system to text, such that the data can be processed by LabVIEW. I've been working on image processing to get Tesseract (using v3.02) to recognize the numbers correctly, but have been hitting some roadblocks and don't quite know how to proceed. This is what I've got so far:

  • The image needs to be between 50 and 100 pixels tall for Tesseract to read it correctly; I've found the best results at a height of 50.
  • The image needs to be cropped so that it contains only one line of text.
  • The image should be black and white.
  • The image should be relatively level from left to right.

I've been using the seven-segment training data 'letsgodigital'. This is the code for the image manipulation I've been doing so far:

  import cv2
  import imutils

  ret, i = video.read()          # video is an open cv2.VideoCapture
  h, width, channels = i.shape   # get frame dimensions

  g = cv2.cvtColor(i, cv2.COLOR_BGR2GRAY)
  histeq = cv2.equalizeHist(g)   # spreads pixel values across the full range
  _, t = cv2.threshold(histeq, 150, 255, cv2.THRESH_BINARY)  # max value is 255, not 225

  cropped = t[int(0.4*h):int(0.6*h), int(0.1*width):int(0.9*width)]
  rotated = imutils.rotate_bound(cropped, angle)             # angle: measured skew of the display
  resized = imutils.resize(rotated, height=resizing_height)  # e.g. 50 px, per the notes above

Some numbers work better than others - for example, '1' seems to have a lot of trouble. The numbers occurring after the '+' or '-' often don't show up, and the '+' often shows up as a '-'. I've played around with the threshold values a bit, too.

The last three steps (cropping, rotating, and resizing) are there because the video sample I've been drawing from was slightly askew. I could try capturing some better data to work with, and I could also try making my own training data instead of using the standard 'letsgodigital' language. I feel like I'm not doing the image processing in the best way, though, and would appreciate some guidance.

I plan to use some degree of edge detection to autocrop to the display, but for now I've just been trying to keep it simple and manually get the results I want. I've uploaded sample images with various degrees of image processing applied at http://imgur.com/a/vnqgP. It's difficult because sometimes I get exactly the right answer from Tesseract, and other times get nothing. The camera and light levels haven't really changed, though, which makes me think it's a problem with my training data. Any suggestions or direction on where I should go would be much appreciated! Thank you.

I'm by no means an expert (I've only ever used tesseract-ocr in one project of mine), but in my experience Tesseract can handle text like yours quite easily. I've made it read unusual fonts on structured backgrounds and the results were still reasonable. That's why I think your images are perfectly fine, and you should probably focus on getting better training data. – Aran-Fey

1 Answer

1 vote

For reading seven-segment digits, normal OCR programs like Tesseract don't usually work well because of the gaps between the individual segments. You should try ssocr, which was made specifically for reading seven-segment displays. However, your preprocessing will need to be better, as ssocr expects the input to be a single row of seven-segment digits.

References - https://www.unix-ag.uni-kl.de/~auerswal/ssocr/

Usage example - http://www.instructables.com/id/Raspberry-Pi-Reading-7-Segment-Displays/
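For a rough idea of what an invocation looks like, something like the command below; the crop coordinates are made up for illustration, and the exact flags should be double-checked against the ssocr documentation linked above:

```shell
# -t 50: luminance threshold (percent); -d -1: auto-detect digit count
# crop X Y W H selects the digit row before recognition
ssocr -t 50 -d -1 crop 40 120 560 80 frame.png
```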