2
votes

I wrote a digital OCR for ios. I have a test image png with two digits 5 and 4. I find the contours. How do I transfer the contour one at tesseract?

init tesseract:

    tess = new tesseract::TessBaseAPI();
    tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
    tess->SetPageSegMode(tesseract::PSM_SINGLE_CHAR); //<-- !!!!
    tess->tesseract::TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");

Function for detect contours:

- (std::vector<std::vector<cv::Point> >)findSquaresInImage:(cv::Mat)_image {
std::vector<std::vector<cv::Point> > squares;
cv::Mat pyr, timg, gray0(_image.size(), CV_8U), gray;
int thresh = 50, N = 11;
cv::pyrDown(_image, pyr, cv::Size(_image.cols/2, _image.rows/2));
cv::pyrUp(pyr, timg, _image.size());
std::vector<std::vector<cv::Point> > contours;
    int ch[] = {0, 0};
    mixChannels(&timg, 1, &gray0, 1, ch, 1);
    for( int l = 0; l < N; l++ ) {
        if( l == 0 ) {
            cv::Canny(gray0, gray, 0, thresh, 5);
            cv::dilate(gray, gray, cv::Mat(), cv::Point(-1,-1));
        }
        else {
            gray = gray0 >= (l+1)*255/N;
        }
        cv::findContours(gray, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        std::vector<cv::Point> approx;

        CvRect rec1;
        std::string str;
        std::map<int,IplImage*> pic_list;

        for( size_t i = 0; i < contours.size(); i++ )
        {

            rec1 = cv::boundingRect(contours[i]);

            if (rec1.height > 0.5*gray.rows && rec1.width < 0.756*gray.cols) {
                NSLog(@"%d %d %d %d", rec1.width, rec1.height, rec1.x, rec1.y);
                cv::approxPolyDP(cv::Mat(contours[i]), approx, arcLength(cv::Mat(contours[i]), true)*0.02, true);
                squares.push_back(approx);
            }
        }
    }

return squares;  }

function for draw contours:

cv::Mat debugSquares( std::vector<std::vector<cv::Point> > squares, cv::Mat image ) {
for ( int i = 0; i< squares.size(); i++ ) {
    // draw contour
    cv::drawContours(image, squares, i, cv::Scalar(255,0,0), 1, 8, std::vector<cv::Vec4i>(), 0, cv::Point());

    // draw bounding rect
    cv::Rect rect = boundingRect(cv::Mat(squares[i]));
    cv::rectangle(image, rect.tl(), rect.br(), cv::Scalar(0,255,0), 2, 8, 0);

    // draw rotated rect
    cv::RotatedRect minRect = minAreaRect(cv::Mat(squares[i]));
    cv::Point2f rect_points[4];
    minRect.points( rect_points );
    for ( int j = 0; j < 4; j++ ) {
        cv::line( image, rect_points[j], rect_points[(j+1)%4], cv::Scalar(0,0,255), 1, 8 ); // blue
    }
}

return image;
}

method for btn Click:

- (IBAction)onMath:(id)sender {
    UIImage *image = [UIImage imageNamed:@"test1.png"];

    cv::Mat iMat = [self cvMatFromUIImage:image];
    std::vector<std::vector<cv::Point> > sq = [self findSquaresInImage:iMat];
    cv::Mat hui = debugSquares(sq, iMat);

    image = [self UIImageFromCVMat:hui];
    self.imView.image = image;
}

image after:

link to project on github: https://github.com/MaxPatsy/iORC

1
You could use SetImage and then SetRectangle with just the contour bounding box; do you know how to give tesseract an image it can read?Kaolin Fire
Can you update your question? The github user/project was deleted with no trace on Internet Archive. Best related link I could find is cyberforum.ru/ios-dev/thread788840.htmlCœur

1 Answers

0
votes

Can you check this answer here

I described some tips for preparing images for Tesseract here: Using tesseract to recognize license plates

In your example, there are several things going on...

You need to get the text to be black and the rest of the image white (not the reverse). That's what character recognition is tuned on. Grayscale is ok, as long as the background is mostly full white and the text mostly full black; the edges of the text may be gray (antialiased) and that may help recognition (but not necessarily - you'll have to experiment)

One of the issues you're seeing is that in some parts of the image, the text is really "thin" (and gaps in the letters show up after thresholding), while in other parts it is really "thick" (and letters start merging). Tesseract won't like that :) It happens because the input image is not evenly lit, so a single threshold doesn't work everywhere. The solution is to do "locally adaptive thresholding" where a different threshold is calculated for each neighbordhood of the image. There are many ways of doing that, but check out for example:

Adaptive gaussian thresholding in OpenCV with cv2.adaptiveThreshold(...,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,...) Local Otsu's method Local adaptive histogram equalization Another problem you have is that the lines aren't straight. In my experience Tesseract can handle a very limited degree of non-straight lines (a few percent of perspective distortion, tilt or skew), but it doesn't really work with wavy lines. If you can, make sure that the source images have straight lines :) Unfortunately, there is no simple off-the-shelf answer for this; you'd have to look into the research literature and implement one of the state of the art algorithms yourself (and open-source it if possible - there is a real need for an open source solution to this). A Google Scholar search for "curved line OCR extraction" will get you started, for example:

Text line Segmentation of Curved Document Images Lastly: I think you would do much better to work with the python ecosystem (ndimage, skimage) than with OpenCV in C++. OpenCV python wrappers are ok for simple stuff, but for what you're trying to do they won't do the job, you will need to grab many pieces that aren't in OpenCV (of course you can mix and match). Implementing something like curved line detection in C++ will take an order of magnitude longer than in python (* this is true even if you don't know python).

Good luck!