
The link below uses Matlab to remove non-text content from an image. I want to do the same thing with OpenCV in Java.

I don't have Matlab to try it with, and I am new to OpenCV. Although I know some of the theory behind the process, I find it difficult to translate the Matlab code into OpenCV 3.0, preferably in Java.


ADD 1 - region detection with MSER (not resolved yet)

For MSER detection, I can use the following code to detect the MSER keypoints.

public static void MSERdetector(String imgName1, String suffix1) {
    Mat imgMat1 = Imgcodecs.imread(picDir + imgName1 + "." + suffix1, Imgcodecs.CV_LOAD_IMAGE_GRAYSCALE);
    String outImgName1 = picDir + "MSER" + "_keypoints_" + imgName1 + "_" + ".tif";
    Mat outImg1 = new Mat();

    // create the feature detector
    FeatureDetector featureDetector = FeatureDetector.create(FeatureDetector.MSER);

    MatOfKeyPoint keypoints1 = new MatOfKeyPoint();
    featureDetector.detect(imgMat1, keypoints1);

    if (!keypoints1.empty()) {
        // draw the detected keypoints and save the result
        Features2d.drawKeypoints(imgMat1, keypoints1, outImg1);
        Imgcodecs.imwrite(outImgName1, outImg1);
    } else {
        System.out.println("No keypoints found for: " + imgName1);
    }
}


But I don't know how to convert keypoints into regions. What I need is below:

enter image description here
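To make the goal concrete: a keypoint only carries a center and a size, whereas a region is a set of pixel coordinates. The C++ call in ADD 4 (`detectRegions`) returns such point sets; whether the Java binding of a given OpenCV version exposes it is something I have not verified. Assuming the point sets are available, turning them into a binary region mask can be sketched without any library as follows. The class name `RegionMask` and the `{x, y}` array layout are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class RegionMask {
    /**
     * Rasterizes regions into a rows x cols mask.
     * Each region is an array of {x, y} pixel coordinates (assumed layout).
     * Pixels that belong to any region are set to 255, all others stay 0.
     */
    static int[][] toMask(List<int[][]> regions, int rows, int cols) {
        int[][] mask = new int[rows][cols];
        for (int[][] region : regions) {
            for (int[] p : region) {
                int x = p[0];
                int y = p[1];
                // Ignore points that fall outside the image bounds.
                if (x >= 0 && x < cols && y >= 0 && y < rows) {
                    mask[y][x] = 255;
                }
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        int[][] regionA = {{0, 0}, {1, 0}, {1, 1}};
        int[][] regionB = {{3, 2}};
        int[][] mask = toMask(Arrays.asList(regionA, regionB), 3, 4);
        for (int[] row : mask) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Note that the mask is indexed as `mask[y][x]` (row first), which mirrors the `at<uchar>(p.y, p.x)` access used in the C++ code in ADD 4.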

ADD 2 - Canny edges and intersection with MSER regions (not resolved yet)

Once I am able to find the MSER regions, I am supposed to intersect them with the Canny edges. I can find Canny edges as shown below, but I don't know how to do the intersection operation.

public static void CANNYedge(String imgName1, String suffix1) {
    Mat imgMat1 = Imgcodecs.imread(picDir + imgName1 + "." + suffix1, Imgcodecs.CV_LOAD_IMAGE_GRAYSCALE);
    //imgMat1 = ImageUtilities.Convert2BW(imgMat1);
    String outImgName1 = picDir + "_CANNY_" + imgName1 + ".tif";
    Mat outImg1 = new Mat();
    // hysteresis thresholds: 0 (low) and 500 (high)
    Imgproc.Canny(imgMat1, outImg1, 0, 500);
    Imgcodecs.imwrite(outImgName1, outImg1);
}
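As far as I understand it, the intersection amounts to a pixel-wise AND of two binary images of the same size: a pixel survives only if it is both inside an MSER region and on a Canny edge. With OpenCV `Mat` objects this is what `Core.bitwise_and(src1, src2, dst)` does. The library-free sketch below only illustrates the logic; the class name `MaskIntersection` and the `int[][]` representation (0 or 255 per pixel) are assumptions made for the example.

```java
import java.util.Arrays;

final class MaskIntersection {
    /**
     * Pixel-wise AND of two masks with identical dimensions.
     * With 0/255 masks, the result is 255 only where both inputs are 255.
     */
    static int[][] and(int[][] a, int[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("masks differ in height");
        }
        int[][] out = new int[a.length][];
        for (int r = 0; r < a.length; r++) {
            if (a[r].length != b[r].length) {
                throw new IllegalArgumentException("masks differ in width at row " + r);
            }
            out[r] = new int[a[r].length];
            for (int c = 0; c < a[r].length; c++) {
                out[r][c] = a[r][c] & b[r][c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] mserMask  = {{255, 255, 0}, {0, 255, 0}};
        int[][] cannyMask = {{0, 255, 255}, {0, 255, 0}};
        for (int[] row : and(mserMask, cannyMask)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```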

My Canny edge output looks like this:

ADD 3 - I have now switched to VS 2013 Community

For setting up OpenCV with VS2013, check here.

ADD 4 - Coding in VC++ 2013

Below is what I have tried so far, with reference to here.

//Step2: Detect MSER regions
Mat grayImage;
cvtColor(colorImage, grayImage, CV_BGR2GRAY);
imshow("Gray Image", grayImage);

// create MSER extractor with default parameters.
// http://code.opencv.org/projects/opencv/wiki/MSER
// http://docs.opencv.org/master/d3/d28/classcv_1_1MSER.html#a49d72a1346413106516a7fc6d95c09bb
Ptr<MSER> mserExtractor = MSER::create();

vector<vector<Point>> mserContours; // one point set per detected region
vector<Rect> mserBBox;//what's this?
mserExtractor->detectRegions(grayImage, mserContours, mserBBox);

// the output image must be allocated before drawing into it
Mat vis = Mat::zeros(grayImage.rows, grayImage.cols, CV_8UC3);
for (int i = 0; i < (int)mserContours.size(); i++)
    drawContours(vis, mserContours, i, Scalar(255, 255, 255), 4);

imshow("MSER by contours", vis);

// single-channel image, so at<uchar>() addresses exactly one pixel
Mat vis2 = Mat::zeros(grayImage.rows, grayImage.cols, CV_8UC1);
for (const vector<cv::Point>& v : mserContours) {
    for (const cv::Point& p : v) {
        vis2.at<uchar>(p.y, p.x) = 255;
    }
}
imshow("MSER by points", vis2);
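On the `mserBBox` question in the code comment: per the OpenCV documentation, the third argument of `detectRegions` receives one bounding rectangle per detected region. Conceptually, each rectangle is just the smallest axis-aligned box that contains all points of that region. A minimal Java sketch of that computation follows; the class name `RegionBounds` and the `{x, y, width, height}` return layout are hypothetical and not part of OpenCV.

```java
import java.util.Arrays;

final class RegionBounds {
    /**
     * Returns {x, y, width, height} of the smallest axis-aligned box
     * containing every {x, y} point of a non-empty region.
     */
    static int[] boundingBox(int[][] region) {
        if (region.length == 0) {
            throw new IllegalArgumentException("region must contain at least one point");
        }
        int minX = region[0][0], maxX = region[0][0];
        int minY = region[0][1], maxY = region[0][1];
        for (int[] p : region) {
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
        }
        // Width and height count pixels, hence the +1.
        return new int[] {minX, minY, maxX - minX + 1, maxY - minY + 1};
    }

    public static void main(String[] args) {
        int[][] region = {{2, 3}, {5, 3}, {4, 7}};
        System.out.println(Arrays.toString(boundingBox(region)));
    }
}
```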

What I got are these:

vis1 - MSER by contours

vis2 - MSER by points


I just experimented with the text detection sample suggested by Miki. It requires some trained model files to run, and it took almost 2 minutes to finish, but we can leave performance for later. My scenario is OCR of text from complex screenshots (sorry for not revealing that until now). The result is quite good for natural scenes, but it is not so appealing for screenshots. Below is the result:

enter image description here

1 Answer


Posting as an answer just to show the result of the OpenCV text detection example:

enter image description here

Now you need to apply text recognition, using for example OCRHMMDecoder.

You'll find a sample here.