2
votes

I have code that traverses a grid of free and "blocked" positions; and finds the largest possible rectangle.

This works just fine. However, I need to be able to find rectangle with more specific critera. Consider the following:

00X0000000
000000####
##########
####000000
0000000000
0000000000

In this diagram; 0 marks available positions, # marks occupied positions, and X marks a desired point.

The code I have now will find the bottom-right area of the grid, as that is truly the largest rectangle. However, the answer I want contains the X position (the top left rectangle).

I do not know how to go about adding this extra criteria. I tried to also keep track of coordinates; throwing out results that did not contain X; but this did not work all the time. Under certain conditions, I would get rectangles that were smaller than desired (probably because of the order I was traversing the grid)

All of the algorithms I have found seem to only capable of finding the largest rectangle, and the coordinates of the rectangle. I have not been able to find any that would be able to exclude results that do not contain a specific point reliably.

Could anyone point me in the right direction, and/or provide an implementation?

EDIT: Since comments cannot have newlines, I need to use this area to explain problems with shole's implementation. It fails for cases such as these:

#0#X
###0
#000
####
0#00
0000
0000

In the above example, the coordinates returned form a rect like this:

0#X
##0
000

answer should have been 1x3, not 3x3

Another example of failure:

0#X#
0000
00##
000#
0000

returns:

#X
00

In this case the answer returned should have been a 1x2 area, instead it thinks it is 2x2.

In some cases, it returns areas that are larger than the area of the entire grid, with coordinates outside of the range of valid coords.

This works SOMETIMES, but it is more often incorrect than it is correct. It does always return an area that contains the point though; just the wrong area/coords.

1
Aren't the bottom two lines the largest rectangle ? The 'area' is 20 (number of zeroes). The bottom right has area 18.user1952500
You are correct. I was eyeballing it. The general idea though is that I don't want to return any of the bottom rectangles, just the one that has the X point in it.dmarra
You can start from X and go up, down, left and right (like a ripple) and expand until you get a '#'. The coordinates near the hash will be one of the end-points of such a maximal rectangle.user1952500
@user1952500 no...actually the end-points of the answer may not be near any hash...in OP's example, point (1,6) is a point of the answer but no where near hash...shole
@shole, I said that the coordinates near the hash will be one of the end-points, not the other way around. The rectangle which is the answer to the above will have the bottom-right corner next to a hash.user1952500

1 Answers

1
votes

Interesting problem, here is my O(R*C) algorithm with R = # rows, C = #columns

The idea is simple, brute force every point as potential rectangles' bottom border, then try to climb up as high as possible to obtain maximum height, then get the largest left and right distance of this vertical line

enter image description here

This tests all potential rectangles, but we have to do it fast. We do it by dynamic programming, row by row, for each (i,j), calculating the maximum height, maximum left and right distance it can reach.

Then we can test if X is inside this potential rectangle, if yes, I add a infinite large offset to the area of this rectangle, it is large enough that any rectangle DOES NOT contains X will be smaller than it, while other rectangles, if contains X and larger than it, we still can choose that rectangle instead.

After the end of the algorithm, we get the maximum Area + Offset, we subtract the offset from it and get the area back.

It's more easy to understand if you run the code (C++) and see the log below, with following input:

6 10
00X0000000
000000####
##########
####000000
0000000000
0000000000

#include<bits/stdc++.h>
using namespace std;

char w[20][20];
int tl[20][20] = {0}, tr[20][20] = {0}, h[20][20] = {0}, l[20][20]={0}, r[20][20]={0};
int ans, R,C, xr,xc, ar1, ac1, ar2, ac2; 

int largestRect()
{
    int area = 0;
    
    for(int i=0; i<20;i++) for(int j=0; j<20;j++) l[i][j] = r[i][j] = 1<<28;
    
   	for(int i=1; i<=R;i++){
   	    for(int j=1; j<=C;j++)
   		    if(w[i][j]!='#') tl[i][j] = tl[i][j-1]+1;
   			else tl[i][j] = 0;
 
   		for(int j=C; j>=1; j--)
   			if(w[i][j]!='#') tr[i][j] = tr[i][j+1]+1;
   			else tr[i][j] = 0;
   		
   		for(int j=1; j<=C; j++)
   			if(w[i][j]!='#') h[i][j] = h[i-1][j]+1;
   			else h[i][j] = 0;
   		
   		for(int j=1; j<=C; j++)
   			if(w[i][j] != '#') l[i][j] = min(tl[i][j], l[i-1][j]);
   		
   		for(int j=1; j<=C; j++)
   			if(w[i][j] != '#') r[i][j] = min(tr[i][j], r[i-1][j]);
   	
   		
   		for(int j=1; j<=C;j++){
   			int offset = 0;
   			if((r[i][j]+l[i][j]-1)*h[i][j] > 0){
   			    if(xr >= i-h[i][j]+1 && xr <= i && xc >= j-l[i][j]+1 && xc <= j+r[i][j]-1) offset = 1<<28;
   				printf("Top left = (%d,%d)  Bottom right = (%d,%d)\n", i-h[i][j]+1, j-l[i][j]+1,i, j+r[i][j]-1);
   				printf("Area = %d\n", (r[i][j]+l[i][j]-1)*h[i][j]);
   				printf("Included X? %d\n", xr >= i-h[i][j]+1 && xr <= i && xc >= j-l[i][j]+1 && xc <= j+r[i][j]-1 );
   				printf("Area with offset = %d\n\n", (r[i][j]+l[i][j]-1)*h[i][j] + offset);
   					
   				if(area <= (r[i][j]+l[i][j]-1)*h[i][j] + offset){
   					area = max(area, (r[i][j]+l[i][j]-1)*h[i][j] + offset);
   						ar1 = i-h[i][j]+1; ac1 = j-l[i][j]+1; ar2 = i; ac2 = j+r[i][j]-1;
   				}
   			}
   		}
   	}
   
    return area;
}

int main(){
	scanf("%d %d", &R, &C);
	
	for(int i=0; i<20;i++) for(int j=0; j<20;j++) w[i][j] = '#';
	
	for(int i=1; i<=R; i++){
		getchar();
		for(int j=1; j<=C; j++){
			w[i][j] = getchar();	
			if(w[i][j] == 'X'){ xr = i; xc = j;}
		} 
	}
	
	ans = largestRect() - (1<<28);

	printf("Largest Rect Containing X is (%d,%d) to (%d,%d), with area = %d\n", ar1, ac1, ar2, ac2, ans);
	return 0;	
}

EDITED PART: Algorithm / Code Explained in Details

First thing first, here is the outline of our O(R*C) algorithm:

  1. Brute force every empty point in grid, use it as a point of bottom border of our potential largest empty rectangle. It's O(R*C)
  2. For each point, I want to go upwards as far as possible, let's define h[i][j] be the array meaning the highest height you can reach starting at (i,j)
  3. Now I want to know how far I can go left and right, within this vertical segment range
  4. I get a height, I get the longest length of right and left I can reach within this height, so I can of course calculate this rectangle's area
  5. Get the maximum of all rectangle's area repeating step 1-4

Then we will see how to do step 2-3 in the algorithm, we will use Dynamic Programming.

Let's define several array:

h[i][j] := the highest height you can reach starting at (i,j)
tr[i][j] := the longest distance to right you can reach starting at (i,j)
tl[i][j] := the longest distance to left you can reach starting at (i,j)

One should be able to see these three arrays can be precompute using O(R*C)

h[i][j] = h[i-1][j]+1  if(i,j) is empty, else h[i][j] = 0
similarly
tr[i][j] = tr[i][j+1]+1 if(i,j) is empty, else tr[i][j] = 0
tl[i][j] = tl[i][j-1]+1 if(i,j) is empty, else tl[i][j] = 0

Now we define two more arrays

l[i][j] := the longest distance to right you can reach starting from 
line (i,j) and (i-h[i][j]+1,j) (That's simply just put h[i][j] into consideration)

Similar to r[i][j], please make sure you understand the difference between tl[i][j] and l[i][j] (tr[i][j] and r[i][j])

Now l[i][j] and r[i][j] can also be precomputed using O(R*C)!

A little difference is that they need h[i][j] and tr[i][j] (or tl[i][j]) to compute it,

so h[i][j], tl[i][j], tr[i][j] must be computed before l[i][j] and r[i][j].

l[i][j] = min(l[i-1][j], tl[i][j]) if(i,j) is empty, else l[i][j] = tl[i][j]
r[i][j] = min(r[i-1][j], tr[i][j]) if(i,j) is empty, else r[i][j] = tr[i][j]

You can think that it is saying the following statements:

l[i][j] is the minimum tl[X][j], X within [i-h[i][j]+1, i]

So l[i][j] is the maximum distance to left within (i,j) to its maximum height

Similar to r[i][j]

So my code is using two for loops doing exactly the Dynamic Programming described above, it's seems complicated because I did not precompute them all before using them

I compute these arrays and use them to calculate the rectangle's area on the fly. Being said, you can always precompute all of these arrays first, then calculate the maximum areas afterwards, the time complexity is still O(R*C)