
I'm trying to develop a simple PC application for license plate recognition (Java + OpenCV + Tess4j). The images aren't very good yet (they will be better later). I want to preprocess the image for tesseract, and I'm stuck on detecting the license plate (rectangle detection).

My steps:

1) Source Image

True Image

Mat img = Imgcodecs.imread("sample_photo.jpg");
Imgcodecs.imwrite("preprocess/True_Image.png", img);

2) Gray Scale

Mat imgGray = new Mat();
Imgproc.cvtColor(img, imgGray, Imgproc.COLOR_BGR2GRAY);
Imgcodecs.imwrite("preprocess/Gray.png", imgGray);

3) Gaussian Blur

Mat imgGaussianBlur = new Mat(); 
Imgproc.GaussianBlur(imgGray,imgGaussianBlur,new Size(3, 3),0);
Imgcodecs.imwrite("preprocess/gaussian_blur.png", imgGaussianBlur);  

4) Adaptive Threshold

Mat imgAdaptiveThreshold = new Mat();
Imgproc.adaptiveThreshold(imgGaussianBlur, imgAdaptiveThreshold, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 99, 4);
Imgcodecs.imwrite("preprocess/adaptive_threshold.png", imgAdaptiveThreshold);

The 5th step should go here: detection of the plate region (probably even without deskewing for now).

I cropped the needed region from the image (after the 4th step) with Paint, and got:

plate region

Then I did OCR (via tesseract, tess4j):

File imageFile = new File("preprocess/adaptive_threshold_AFTER_PAINT.png");
ITesseract instance = new Tesseract();
instance.setLanguage("eng");
instance.setTessVariable("tessedit_char_whitelist", "acekopxyABCEHKMOPTXY0123456789");
String result = instance.doOCR(imageFile); 
System.out.println(result);

and got a (good enough?) result: "Y841ox EH" (almost correct).

How can I detect and crop the plate region after the 4th step? Do I need to change (improve) anything in steps 1-4? I would like to see an example implemented with Java + OpenCV (not JavaCV).
Thanks in advance.

EDIT (thanks to @Abdul Fatir's answer): I'm providing a code sample that works (for me at least) (Netbeans+Java+OpenCV+Tess4j) for those who are interested in this question. The code is not the best, but I wrote it just for studying.
http://pastebin.com/H46wuXWn (do not forget to put tessdata folder into your project folder)

3
You could try analyzing the contours. However it might be more reliable to use a cascade classifier to locate the license plate (test your algorithm with a white car and see how it works). Deskew the plate so it's horizontal. You should also add an additional phase before tesseract: segment the license plate into individual characters (vertical projection will probably work well given the quality of your image) and only feed those to tesseract. - Dan Mašek
Can you post the image after step 4 as well? I think you should be able to detect the plate border by extracting contours and filtering them on size and h/w ratio. Once you have the contour (since you know it is a rectangle), you can undo the projection transformation. - RobAu
@RobAu, Yeah sure: i.imgur.com/chrNMYX.png - DocC

3 Answers

13
votes

Here's how I suggest you should do this task.

  1. Convert to Grayscale.
  2. Gaussian Blur with 3x3 or 5x5 filter.
  3. Apply Sobel Filter to find vertical edges.

    Sobel(gray, dst, -1, 1, 0)

  4. Threshold the resultant image to get a binary image.
  5. Apply a morphological close operation using suitable structuring element.
  6. Find contours of the resulting image.
  7. Find minAreaRect of each contour. Select rectangles based on aspect ratio and minimum and maximum area.
  8. For each selected contour, find edge density. Set a threshold for edge density and choose the rectangles exceeding that threshold as possible plate regions.
  9. A few rectangles will remain after this. You can filter them based on orientation or any criteria you deem suitable.
  10. Clip these detected rectangular portions from the image after adaptiveThreshold and apply OCR.
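
The shape test in step 7 can be sketched in plain Java, without OpenCV, so it is easy to try in isolation. `PlateCandidateFilter` and its methods are hypothetical names, the ranges are the ones quoted for the example images in this answer (aspect ratio 2 to 12, area 300 to 10000), and treating the bounds as exclusive is an assumption. In a real pipeline the width and height would come from the `size` of each rectangle returned by `minAreaRect`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an OpenCV class): shape test for candidate rectangles.
class PlateCandidateFilter {
    // Ranges quoted for the example images; they will need tuning for other cameras.
    static final double MIN_ASPECT = 2.0;
    static final double MAX_ASPECT = 12.0;
    static final double MIN_AREA = 300.0;
    static final double MAX_AREA = 10000.0;

    // Returns true when a rectangle of the given size looks plate-like.
    static boolean isCandidate(double width, double height) {
        if (width <= 0 || height <= 0) {
            return false;
        }
        // A rotated rectangle may report width < height, so compare long side to short side.
        double aspect = Math.max(width, height) / Math.min(width, height);
        double area = width * height;
        return aspect > MIN_ASPECT && aspect < MAX_ASPECT
                && area > MIN_AREA && area < MAX_AREA;
    }

    // Keeps only the {width, height} pairs that pass the shape test.
    static List<double[]> keepCandidates(List<double[]> sizes) {
        List<double[]> kept = new ArrayList<>();
        for (double[] s : sizes) {
            if (isCandidate(s[0], s[1])) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<double[]> sizes = new ArrayList<>();
        sizes.add(new double[] {120, 30}); // plate-like
        sizes.add(new double[] {50, 50});  // too square
        sizes.add(new double[] {20, 5});   // too small
        System.out.println(keepCandidates(sizes).size() + " candidate(s) kept");
    }
}
```

Using the long side over the short side makes the test independent of how the rotated rectangle happens to label its sides.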

a) Result after Step 5


b) Result after Step 7. Green ones are all the minAreaRects and the Red ones are those which satisfy the following criteria: Aspect Ratio range (2,12) & Area range (300,10000)

c) Result after Step 9. Selected rectangle. Criteria: Edge Density > 0.5


EDIT

For edge-density, what I did in the above examples is the following.

  1. Apply Canny Edge detector directly to input image. Let the cannyED image be Ic.
  2. Multiply results of Sobel filter and Ic. Basically, take an AND of Sobel and Canny images.
  3. Gaussian Blur the resultant image with a large filter. I used 21x21.
  4. Threshold the resulting image using Otsu's method. You'll get a binary image.
  5. For each red rectangle, rotate the portion inside this rectangle (in the binary image) to make it upright. Loop through the pixels of the rectangle and count white pixels. (How to rotate?)

Edge Density = No. of White Pixels in the Rectangle/Total no. of Pixels in the rectangle

  6. Choose a threshold for edge density.

NOTE: Instead of going through steps 1 to 3 here, you can also use the binary image from step 5 of the main procedure (the result of the morphological close) for calculating edge density.
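
The edge-density formula above can be sketched in plain Java as follows. This is only an illustration under assumptions: the binary image is represented as a `boolean[][]` (true = white pixel) instead of a `Mat`, the rectangle has already been rotated upright, and `EdgeDensity.density` is a hypothetical name.

```java
// Hypothetical helper: white-pixel ratio inside an upright rectangle of a binary image.
class EdgeDensity {
    // binary[row][col] is true for a white pixel.
    // (x, y) is the top-left corner of the rectangle, w and h its size in pixels.
    static double density(boolean[][] binary, int x, int y, int w, int h) {
        if (w <= 0 || h <= 0) {
            throw new IllegalArgumentException("rectangle must have positive size");
        }
        int white = 0;
        for (int row = y; row < y + h; row++) {
            for (int col = x; col < x + w; col++) {
                if (binary[row][col]) {
                    white++;
                }
            }
        }
        // Edge density = white pixels in the rectangle / total pixels in the rectangle.
        return (double) white / (w * h);
    }

    public static void main(String[] args) {
        boolean[][] binary = {
            {true, true, true, true},
            {true, false, false, false},
        };
        double d = density(binary, 0, 0, 4, 2);
        // 0.5 is the threshold used for the example images in this answer.
        System.out.println("density = " + d + ", plate candidate: " + (d > 0.5));
    }
}
```

With an OpenCV `Mat`, counting the non-zero pixels of the cropped region and dividing by its pixel count should give the same ratio.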

2
votes

Actually, OpenCV has a pre-trained model specifically for Russian license plates: haarcascade_russian_plate_number

There is also an open source ANPR project for Russian license plates: plate_recognition. It does not use tesseract, but it has a quite good pre-trained neural network.

1
votes
  • Find all connected components (the white areas) and determine their outlines.
  • Filter them based on size (as a fraction of the image), width/height ratio and white/black ratio to retrieve candidate plates.
  • Undo the transformation of the rectangle.
  • Remove the bolts.
  • Pass the image to the OCR engine.
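
The first two bullets can be sketched in plain Java as below. This is a minimal illustration, not the method used in this answer: `ComponentFinder` is a hypothetical class, the image is a `boolean[][]` (true = white) rather than a `Mat`, and 4-connectivity is an assumption. With OpenCV you would more likely get the outlines from `Imgproc.findContours`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: labels white 4-connected components with a breadth-first search.
class ComponentFinder {
    // Returns one int[]{minX, minY, maxX, maxY, pixelCount} per white component.
    static List<int[]> findComponents(boolean[][] img) {
        int h = img.length;
        int w = (h == 0) ? 0 : img[0].length;
        boolean[][] seen = new boolean[h][w];
        int[] dx = {1, -1, 0, 0};
        int[] dy = {0, 0, 1, -1};
        List<int[]> components = new ArrayList<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!img[y][x] || seen[y][x]) {
                    continue;
                }
                int[] box = {x, y, x, y, 0};
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {x, y});
                seen[y][x] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    // Grow the bounding box and count the pixel.
                    box[0] = Math.min(box[0], p[0]);
                    box[1] = Math.min(box[1], p[1]);
                    box[2] = Math.max(box[2], p[0]);
                    box[3] = Math.max(box[3], p[1]);
                    box[4]++;
                    for (int d = 0; d < 4; d++) {
                        int nx = p[0] + dx[d];
                        int ny = p[1] + dy[d];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && img[ny][nx] && !seen[ny][nx]) {
                            seen[ny][nx] = true;
                            queue.add(new int[] {nx, ny});
                        }
                    }
                }
                components.add(box);
            }
        }
        return components;
    }

    // Width/height ratio of the bounding box.
    static double aspectRatio(int[] box) {
        return (double) (box[2] - box[0] + 1) / (box[3] - box[1] + 1);
    }

    // Fraction of white pixels inside the bounding box.
    static double whiteFraction(int[] box) {
        int area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1);
        return (double) box[4] / area;
    }
}
```

Each component can then be kept or rejected by comparing its pixel count with the image size, its `aspectRatio` with the expected plate proportions, and its `whiteFraction` with a chosen threshold.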