19
votes

I'm trying to scan a constant size image and locate the drawn rectangles in it. The rectangles can come in any size, but only red colored.

This is not where the problem starts.

I'm gonna use an already written function, and I will use it as pseudo code calls later on my code logic.

Rectangle Locate(Rectangle scanArea); // scans for a rectangle in a given scan area. if no rectagle is found,returns null.

My logic was like this:

Find a first initial red rectangle using the Locate() function with the full image size as an argument. Now, divide the rest areas, and keep scanning recursively. The main point in this algorithm's logic is that you never check a checked already area, and you don't have to use any condition because always the scanArea parameter is a new area which you haven't scanned before (and that's thanks to the division technique). The division process is done like this: right area of the current found rectangle, the bottom area, and the left area.

Here's an image which illustrates that process. enter image description here (The white dotted rectangles and the yellow arrows are not part of the image, I've added them only for the illustration.) As you seen, once a red rectangle found, I keep scanning the right of it, bottom and left. Recursively.

So here's the code for that method:

List<Rectangle> difList=new List<Rectangle>();

private void LocateDifferences(Rectangle scanArea)
{
    Rectangle foundRect = Locate(scanArea);
    if (foundRect == null)
        return; // stop the recursion.

    Rectangle rightArea = new Rectangle(foundRect.X + foundRect.Width, foundRect.Y, (scanArea.X + scanArea.Width) - (foundRect.X + foundRect.Width), (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define right area.
    Rectangle bottomArea = new Rectangle(foundRect.X, foundRect.Y + foundRect.Height, foundRect.Width, (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define bottom area.
    Rectangle leftArea = new Rectangle(scanArea.X, foundRect.Y, (foundRect.X - scanArea.X), (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define left area.

    difList.Add(rectFound);

    LocateDifferences(rightArea);
    LocateDifferences(bottomArea);
    LocateDifferences(leftArea);
}

So far everything works alright, it does find every red rectangle. But sometimes, the rectangles saved as few rectangles. For a reason that obvious for me: overlapping rectangles case.

A problematic case for example: enter image description here

Now, in this case, the program finds the first red region as planned, but then, since the right area starts only in the middle of the full second region, it does not scan from the beginning of the second red rectangle!

In a similar way, I can divide the areas so the bottom area stretches from the start of scanArea to the end, which would be like this: enter image description here But now we would have a problem when scanning overlapping rectangles on the right and on the left of the foundRect rectangle, for example, in this kind of case:

enter image description here I need to get each rectangle in one piece only. I would like to get any help or suggestion combined with my code logic - because it works great but just needs a little one or two additional conditions in the recursion method I think. I'm not sure what to do and I would really appreciate any help.

If something isn't clear enough, just tell and I'll explain it as best as I can! Thanks!

Of course, this is not the real problem I'm facing against, it is just a little demonstration which could help me solve the real problem that I'm working on (which is a real-time internet project).

10
The basic (though perhaps not most efficient) solution would be to find the split rectangles and then do some post processing to merge them together. For example, for each rectangle see see if there's another where the top right and bottom right corners align to the top left and bottom left of another. Merge them together. Then find where bottom right and bottom left match the top right and top left and merge those together. Put the merged rectangle back into your collection. Then repeat until no such matches are found.pinkfloydx33
@pinkfloydx33 to do a nested loop? Becaue I have to compare each region to every other regions in the list.. did you mean to this?Slashy
Work on a copy of the image and flood-fill every rectangle with the background colour after you've found it, then start checking again from the point where you first found the rectangle? (Or is checking a large part of the image n+1 times to find n rectangles too time-consuming?)m69 ''snarky and unwelcoming''
This is poor logic, you'd be better off writing a completely different algorithm. Working around flawed logic is never the right thing to do. It'll cost more time to implement and it'll be slower than an algorithm that is correct from the start.Cris Luengo
@m69 of course! I just wrote a small explaination how the Locate function works: finding the first(most top-left) red pixel, then it keeps scanning to the right,and then when it finds no more red pixels, it start scanning the height ,keep moving down-easy and simple. It works very fast due to the native methods (such as memcmp()) used there. The scan function is limited to the scan area parameter. if you have any other suggestion... or if you want to see the full function feel free to ask! I just don't think the bottleneck is there, so i haven't posted itSlashy

10 Answers

9
votes

An algorithm which can find multiple rectangles by scanning an image once doesn't have to be complicated. The main difference with what you're doing now is that when you find the top corner of a rectangle, you shouldn't immediately find the width and height and store the rectangle, but instead keep it in a list of unfinished rectangles temporarily, until you come across its bottom corner. This list can then be used to efficiently check whether each red pixel is part of a new rectangle, or one you've already found. Consider this example:

enter image description here

We start scanning top-to-bottom and left-to-right. In line 1, we find a red pixel at position 10; we continue scanning until we find the next black pixel (or reach the end of the line); now we can store it in a list of unfinished rectangles as {left,right,top}:

unfinished: {10,13,1}  

When scanning the next line, we iterate over the list of unfinished rectangles, so we know when to expect a rectangle. When we reach position 10, we find a red pixel as expected, and we can skip to position 14 and iterate past the unfinished rectangle. When we reach position 16 we find an unexpected red pixel, and continue to the first black pixel at position 19, and then add this second rectangle to the unfinished list:

unfinished: {10,13,1},{16,18,2}  

After we've scanned lines 3 to 5, we get this list:

unfinished: {1,4,3},{6,7,3},{10,13,1},{16,18,2},{21,214}  

Note that we insert newly found rectangles while we're iterating over the list (using e.g. a linked list), so that they are in order from left to right. That way, we only ever have to look at one unfinished rectangle at a time while we're scanning the image.

On line 6, after passing the first two unfinished rectangles, we find an unexpected black pixel at position 10; we can now remove the third rectangle from the unfinished list, and add it to an array of complete rectangles as {left,right,top,bottom}:

unfinished: {1,4,3},{6,7,3},{16,18,2},{21,21,4}  
finished: {10,13,1,5}  

When we reach the end of line 9, we've completed all the rectangles that were unfinished after line 5, but we've found a new rectangle on line 7:

unfinished: {12,16,7}  
finished: {10,13,1,5},{16,18,2,5},{1,4,3,6},{6,7,3,8},{21,21,4,8}  

If we continue to the end, the result is:

unfinished:  
finished: {10,13,1,5},{16,18,2,5},{1,4,3,6},{6,7,3,8},{21,21,4,8},  
          {12,16,7,10},{3,10,10,13},{13,17,13,14},{19,22,11,14}  

If there are any unfinished rectangles left at this point, they border the bottom edge of the image, and can be completed with bottom=height-1.

Note that skipping through unfinished rectangles means you only have to scan the black pixels and the top and left edge of the red rectangles; in the example we skipped over 78 of the 384 pixels.

Click [here] to see a simple C++ version in action on rextester.com (sorry, I don't speak C#).

(Rextester seems to be hacked at the moment, so I've removed the link and pasted the C++ code here.)

#include <vector>
#include <list>
#include <iostream>

struct rectangle {int left, right, top, bottom;};

std::vector<rectangle> locate(std::vector<std::vector<int>> &image) {
    std::vector<rectangle> finished;
    std::list<rectangle> unfinished;
    std::list<rectangle>::iterator current;
    int height = image.size(), width = image.front().size();
    bool new_found = false;                  // in C++17 use std::optional<rectangle>new_rect and check new_rect.has_value()

    for (int y = 0; y < height; ++y) {
        current = unfinished.begin();        // iterate over unfinished rectangles left-to-right
        for (int x = 0; x < width; ++x) {
            if (image[y][x] == 1) {          // red pixel (1 in mock-up data)
                if (current != unfinished.end() && x == current->left) {
                    x = current->right;      // skip through unfinished rectangle
                    ++current;
                }
                else if (!new_found) {       // top left corner of new rectangle found
                    new_found = true;
                    current = unfinished.insert(current, rectangle());
                    current->left = x;
                }
            } else {                         // black pixel (0 in mock-up data)
                if (new_found) {             // top right corner of new rectangle found
                    new_found = false;
                    current->right = x - 1; current->top = y;
                    ++current;
                }
                else if (current != unfinished.end() && x == current->left) {
                    current->bottom = y - 1; // bottom of unfinished rectangle found
                    finished.push_back(std::move(*current));
                    current = unfinished.erase(current);
                }
            }
        } // if there is still a new_rect at this point, it borders the right edge
    } // if there are unfinished rectangles at this point, they border the bottom edge
    return std::move(finished);
}

int main() {
    std::vector<std::vector<int>> image { // mock-up for image data
        {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,1,1,1,0,0,0,0,0},
        {0,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,1,1,0,0,0,0,0},
        {0,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,1,1,0,0,1,0,0},
        {0,1,1,1,1,0,1,1,0,0,1,1,1,1,0,0,1,1,1,0,0,1,0,0},
        {0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0},
        {0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,0,0},
        {0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,0,0},
        {0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0},
        {0,0,0,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,0,0,0,0,0,0},
        {0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0},
        {0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0},
        {0,0,0,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,0},
        {0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,1,1,1,1,0},
        {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
    };

    std::vector<rectangle> rectangles = locate(image);
    std::cout << "left,right,top,bottom:\n";
    for (rectangle r : rectangles) {
        std::cout << (int) r.left << "," << (int) r.right << "," << (int) r.top << "," << (int) r.bottom << "\n";
    }

    return 0;
}

If you find that C#'s linked list implementation is not fast enough, you could use two arrays of length image width, and when you find the top of a rectangle between positions x1 and x2 on line y, store the incomplete rectangle as width[x1]=x2-x1 and top[x1]=y, and reset them to zero when you store the complete rectangle.

This method will find rectangles as small as 1 pixel. If there is a minimum size, you can scan the image using greater steps; with a minimum size of 10x10 you'd only have to scan around 1% of the pixels.

4
votes

Simplest approach to use simple algorithm like:

function find(Image): Collection of Rects
   core_rect = FindRects(Image)
   split(core_rect) -> 4 rectangles (left-top, left-bottom, right-top, right-bottom)
   return Merge of (find(leftTop), find(leftBottom), ...)

function findAll(Image): Collection of Rects
   rects <- find(Image)
   sort rectangles by X, Y
   merge rectangles
   sort rectangles by Y, X
   merge rectangles
   return merged set

Merging of two rectangles should be fairly simple - they should have shared border. But given approach would work only in case image contains rectangles and only rectangles. In case of more complex geometric figures might be better to use line by line scanning algorithm for area detection and on the next stage shape type identification.

4
votes

Based on your requisites:

  • Leaving the function Locate(Rectangle scanArea) untouched.
  • Using a recursive algorithm for scanning Left / Bottom / Right (fig).

scanAlg

I’d introduce an extra argument of type Side to the recursive function.

internal enum Side : byte 
{
    Left,
    Bottom,
    Right
}

Say we use Bottom as the “cutting” direction, we could then boost efficiency (of reassembling the cut rectangles) by creating a wrapper that stores extra information for the rectangles found in the bottomAreas.

internal class RectangleInfo
{
    public RectangleInfo(Rectangle rect, bool leftOverlap, bool rightOverlap)
    {
        Rectangle = rect;
        LeftOverlap = leftOverlap;
        RightOverlap = rightOverlap;
    }
    public Rectangle Rectangle { get; set; }
    public bool LeftOverlap { get; set; }
    public bool RightOverlap { get; set; }
}

For faster lookup you could also divide the cut rectangles found in leftAreas and rightAreas over separate lists. Which would turn your sample code into something like:

List<Rectangle> difList = new List<Rectangle>();

List<Rectangle> leftList = new List<Rectangle>();
List<RectangleInfo> bottomList = new List<RectangleInfo>();
List<Rectangle> rightList = new List<Rectangle>();

private void AccumulateDifferences(Rectangle scanArea, Side direction)
{
    Rectangle foundRect = Locate(scanArea);
    if (foundRect == null)
        return; // stop the recursion.

    switch (direction)
    {
        case Side.Left:
            if (foundRect.X + foundRect.Width == scanArea.X + scanArea.Width)
                leftList.Add(foundRect);
            else difList.Add(foundRect);
            break;

        case Side.Bottom:
            bottomList.Add(new RectangleInfo(foundRect, foundRect.X == scanArea.X, foundRect.X + foundRect.Width == scanArea.X + scanArea.Width));
            break;

        case Side.Right:
            if (foundRect.X == scanArea.X)
                rightList.Add(foundRect);
            else difList.Add(foundRect);
            break;
    }
    Rectangle leftArea = new Rectangle(scanArea.X, foundRect.Y, (foundRect.X - scanArea.X), (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define left area.
    Rectangle bottomArea = new Rectangle(foundRect.X, foundRect.Y + foundRect.Height, foundRect.Width, (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define bottom area.
    Rectangle rightArea = new Rectangle(foundRect.X + foundRect.Width, foundRect.Y, (scanArea.X + scanArea.Width) - (foundRect.X + foundRect.Width), (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); //define right area.

    AccumulateDifferences(leftArea, Side.Left);
    AccumulateDifferences(bottomArea, Side.Bottom);
    AccumulateDifferences(rightArea, Side.Right);
}

private void ProcessDifferences()
{
    foreach (RectangleInfo rectInfo in bottomList)
    {
        if (rectInfo.LeftOverlap)
        {
            Rectangle leftPart =
                leftList.Find(r => r.X + r.Width == rectInfo.Rectangle.X
                                   && r.Y == rectInfo.Rectangle.Y
                                   && r.Height == rectInfo.Rectangle.Height
                             );
            if (leftPart != null)
            {
                rectInfo.Rectangle.X = leftPart.X;
                leftList.Remove(leftPart);
            }
        }

        if (rectInfo.RightOverlap)
        {
            Rectangle rightPart =
                rightList.Find(r => r.X == rectInfo.Rectangle.X + rectInfo.Rectangle.Width
                                    && r.Y == rectInfo.Rectangle.Y
                                    && r.Height == rectInfo.Rectangle.Height
                              );
            if (rightPart != null)
            {
                rectInfo.Rectangle.X += rightPart.Width;
                rightList.Remove(rightPart);
            }
        }

        difList.Add(rectInfo.Rectangle);
    }

    difList.AddRange(leftList);
    difList.AddRange(rightList);
}

private void LocateDifferences(Rectangle scanArea)
{
    AccumulateDifferences(scanArea, Side.Left);
    ProcessDifferences();

    leftList.Clear();
    bottomList.Clear();
    rightList.Clear();
}

Finding adjacent rectangles

It’s possible that multiple rectangles exist with the same X values in rightList (or X + Width values in leftList), therefore we need to verify the intersection when a possible match is found.

Depending on the number of elements you could also use dictionaries (for faster lookup) in case of leftList and rightList. Using the top intersection point as a key and then checking the Heights before merging.

4
votes

Following your criteria of not changing the Locate() function but just extending on your existing logic, we need to join any rects post scan. Try this:

First modify your LocateDifferences() function slightly to keep track of rectangles that may need to be joined.

private void LocateDifferences(Rectangle scanArea)
{
    Rectangle foundRect = Locate(scanArea);
    if (foundRect == null)
        return; // stop the recursion.

    Rectangle rightArea = new Rectangle(foundRect.X + foundRect.Width, foundRect.Y, (scanArea.X + scanArea.Width) - (foundRect.X + foundRect.Width), (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); //define right area.
    Rectangle bottomArea = new Rectangle(foundRect.X, foundRect.Y + foundRect.Height, foundRect.Width, (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define bottom area.
    Rectangle leftArea = new Rectangle(scanArea.X, foundRect.Y, (foundRect.X - scanArea.X), (scanArea.Y + scanArea.Height) - (foundRect.Y + foundRect.Height)); // define left area.

    if (foundRect.X == scanArea.X || foundRect.Y == scanArea.Y || (foundRect.X + foundRect.Width == scanArea.X + scanArea.Width) || (foundRect.Y + foundRect.Height == scanArea.Y + scanArea.Height)) 
    {
        // edge may extend scanArea
        difList.Add(Tuple.Create(foundRect, false));
    } else {
        difList.Add(Tuple.Create(foundRect, true));
    }

    LocateDifferences(rightArea);
    LocateDifferences(bottomArea);
    LocateDifferences(leftArea);
}

I've also added these two methods for use:

// JoinRects: will return a rectangle composed of r1 and r2.
private Rectangle JoinRects(Rectangle r1, Rectangle r2)
{
    return new Rectangle(Math.Min(r1.X, r2.X), 
                    Math.Min(r1.Y, r2.Y), 
                    Math.Max(r1.Y + r1.Width, r2.Y + r2.Width), 
                    Math.Max(r1.X + r1.Height, r2.X + r2.Height));
}

// ShouldJoinRects: determines if the rectangles are connected and the height or width matches.
private bool ShouldJoinRects(Rectangle r1, Rectangle r2)
{
    if ((r1.X + r1.Width + 1 == r2.X && r1.Y == r2.Y && r1.Height == r2.Height)
     || (r1.X - 1 == r2.x + r2.Width && r1.Y == r2.Y && r1.Height == r2.Height)
     || (r1.Y + r1.Height + 1 == r2.Y && r1.X == r2.X && r1.Width == r2.Width)
     || (r1.Y - 1 == r2.Y + r2.Height && r1.X == r2.X && r1.Width == r2.Width))
    {
        return true;
    } 
    else 
    {
        return false;
    }
}

Finally your main function that kicks off the scan

List<Tuple<Rectangle, Bool>> difList = new List<Tuple<Rectangle, Bool>();

// HERE: fill our list by calling LocateDifferences
LocateDifferences();

var allGood = difList.Where(t => t.Item2 == true).ToList();
var checkThese = difList.Where(t => t.Item2 == false).ToArray();

for (int i = 0; i < checkThese.Length - 1; i++)
{
    // check that its not an empty Rectangle
    if (checkThese[i].IsEmpty == false) 
    {
        for (int j = i; j < checkThese.Length; j++)
        {
            // check that its not an empty Rectangle
            if (checkThese[j].IsEmpty == false) 
            {
                if (ShouldJoinRects(checkThese[i], checkThese[j])
                {
                    checkThese[i] = JoinRects(checkThese[i], checkThese[j]);
                    checkThese[j] = new Rectangle(0,0,0,0);
                    j = i // restart the inner loop in case we are dealing with a rect that crosses 3 scan areas
                }
            }
        }
        allGood.Add(checkThese[i]);
    }
}

//Now 'allGood' contains all the rects joined where needed.
3
votes

I would solve it in the following way:

  1. I will start reading the image from the first pixel.
  2. register the location (x,y) of each red pixel. [put 1 at (x,y) in a result matrix that has the same size of image]
  3. cost is O(nxm) where n is the number of rows and m is the number of columns of the image.
  4. a rectangle is a collection of connected 1s where sum(y) is the same for each x. This is a necessary check to ensure capturing rectangles only in case there were blobs/circles (green segment in the image below) ..etc.

Below is a photo of the result matrix: nxm result matrix

3
votes

There is no need to reinvent the wheel. This is a connected component labeling problem.

https://en.wikipedia.org/wiki/Connected-component_labeling

There are various ways to address it. One is by run-length coding the image rows and finding the overlaps from row to row.

Another is by scanning the image and performing a flood fill every time you meet a red pixel (you erase the whole rectangle).

Yet another is by scanning the image and performing an outline following when you meet a red pixel (and you mark every outline pixel so that the blob is n more processed).

All these methods will work for arbitrary shapes, and you can adapt them to the specific shape of your rectangles.

2
votes

Take a look at this post. Similar problem was solved there. I propose to use flood fill algorithm to detect the rectangles.

2
votes

Based on the clarifying comments, your existing method is the perfect starting point, just in my opinion it should operate using an auxiliary bitmap containing which pixels should not be checked (at all, or again).

Assuming the majority of image is not red:

  1. clear auxiliary bitmap to 0
  2. set x=y=0, starting position for scanning
  3. scan image from x,y, proceed in storage order for efficiency, locate the first red pixel on the image where auxiliary bitmap is 0
  4. that is the corner of a rectangle, find its dimensions using the existing approach (start a horizontal and a vertical scan, etc.)
  5. record the rectangle (x,y,w,h)
  6. fill rectangle (x,y,w,h) in auxiliary bitmap with 1-s
  7. x+=w+1, continue from 2 (implying that it will check if the new x position is still inside image dimensions and try from 0,y+1 if necessary, and also recognize if scanning is just done)

If most of the image is covered with red rectangles, I would swap the check in 3 (check the auxiliary bitmap first, and then see if pixel is red), and extend the filling step (6) with one pixel to the left, right and bottom directions (pixels towards the top have been checked already)

Personally I believe more in the cache-efficiency of reading neighboring pixels in memory-order, than in jumping around (due to the partitioning idea), but still visiting most pixels plus having to join a potentially large number of fragment-rectangles at the end.

1
votes

I apologize but I didn't read your solution because I am not sure if you want a good solution or to fix the problem with that solution.

A simple solution using exiting building blocks (like OpenCV which I don't know if have a port to c#) is:

  1. take the red channel (because you said you want to detect only red rectangles)
  2. findContours
  3. for each contour 3.1 take its bounding box 3.2 check if the contour is rectangle by comparing its ttotal area to the total area of the bounding box.

the solution will change depends on the variety of your input images. I Hope I help you. if not please direct me to what kind of help you want.

1
votes

You do a line scan per pixel, over an image.

if pixel up is black and pixel left is black but pixel itself is red then you have left upper corner (x1,y1). go to the right till its black again (that's right top y2+1) Goto bottom to find black thats x2+1 so you can derive right, bottom (x2,y2)

Store x1,y1,x2,y2 in a list struct or class paint your just found rectangle black, and continou line scan