7
votes

I've managed over the past few months to teach myself PHP, PDO & SQL, and have built a basic dynamic website with user registration/email activation/ and login logout functionality, following PHP/SQL best practices. Now I'm stuck on the next task...

I've created a huge dataset of squares/polygons (3 million+), each 1 minute of latitude & longitude in size, stored in a PHP array with a single set of coordinates (the top left corner). To extrapolate a square-like shape, I simply add 0.016 degrees (~1 minute) to each direction and generate the other 3 coordinates.

I now need to check that each polygon in said array is over at least some portion of land in the United States.... i.e. if one were to produce a graphical output of my completed data set and take a look at the San Fransisco coastline, they'd see something like this.

It's similar to the point-in-polygon problem, except it's dealing with another polygon instead of a point, the other polygon is a country border, and I'm not just looking at intersections. I want to check if:

  • A polygon/square intersects with the polygon. (Think coastline/border).
  • A polygon/square is inside the polygon. (Think continental U.S.).
  • A polygon/square contains part of the polygon. (Think small island).

This is illustrated with my crudely drawn image:

This is not easy!

If it matches any of these three conditions, I want to keep the square. If it does not interact with the big polygon in anyway (i.e. it's over water), discard it.

I was thinking the big polygon would be a shapefile of the U.S., that or a KML file which I could strip the coordinates out of to create a very complex polygon from.

Then, I thought I'd pass these matching squares and square ID's over to a csv file for integration into a MySQL table containing a set of coordinates of each square (in fact, I'm not even sure of the best practices for handling tables of that size in MySQL, but I'll come to that when need be). The eventual goal would then be to develop a map using Google Maps API via Javascript to display these squares over a map on the website I'm coding (obviously only showing squares within the viewpoint to make sure I don't tax my db to death). I'm pretty sure I'd have to pass such information through PHP first, too. But all of that seems relatively easy compared to the task of actually making said data set.

This is obviously something that cannot be done by hand, so it needs automating. I know a bit of Python, so would that be of help? Any other tips on where to start? Someone willing to write some of the code for me?

2
A square in what projection? Mercator? Note that equal squares in Mercator projection have different size on different latitudes. And note that in Geographic projection you would have different number of squares on each latitude. Also, consider saving coordinates of one corner only. You could calculate the others as needed.Olexa
Probably Mercator, I believe this is what Google Maps is coded in. Good point about the coordinates... but how can I generate the polygons in the first place?marked-down
Google uses 'special' odd projection, similar to Mercator: alastaira.wordpress.com/2011/01/23/…Olexa
A rectangle intersects a polygon if any point of the polygon is within the rectangle, or if any corner of the rectangle is within a polygon. See this for checking whether a point is within a polygon: stackoverflow.com/questions/217578/…. Checking whether a point is within a rectangle is much easier, of course (just compare coordinates).Olexa
can you show us what you have so far? as in the format of the data you have already? are the squares in one array and polys in another? etclollercoaster

2 Answers

3
votes

Here is a solution that will be efficient, and as simple as possible to implement. Note that I do not say simple, but as simple as possible. This is a tricky problem, as it turns out.

1) Get U.S. polygon data using Shapefiles or KFL, which will yield a set of polygon shapes (land masses), each defined by a list of vertices.

2) Create a set of axis aligned bounding box (AABB) rectangles for the United States: one for Alaska and each Alaskan island, one for each Hawaiian island, one for the Continental United States, and one for each little island off the coast of the Continental U.S. (e.g., Bald Head Island in N.C., Catalina off the coast of California). Each bounding box is defined as a rectangle with the corners which are the minimum and maximum latitude and longitude for the shape. My guess is that there will be a few hundred of these. For example, for Hawaii's big island, the latitude runs 18°55′N to 28°27′N, and the longitude runs 154°48′W to 178°22′W. Most of your global lat/long pairs get thrown out at this step, as they are not in any of those few hundred bounding boxes. For example, your bounding box at 10°20'W, 30°40'N (a spot in the Atlantic Ocean near Las Palmas, Africa) does not overlap Hawaii, because 10°20'W is less than 154°48′W. This bit would be easy to code in Python.

3) If the lat/long pair DOES overlap one of the several hundred AABB rectangles, you then need to test it against the single polygon within the AABB rectangle. To do this it is strongly recommended to use the Minkowski Difference (MD). Please thoroughly review this website first:

http://www.wildbunny.co.uk/blog/2011/04/20/collision-detection-for-dummies/

In particular, look at the "poly versus poly" demo halfway down the page, and play with it a little. When you do, you will see that when you take the MD of the 2 shapes, if that MD contains the origin, then the two shapes are overlapping. So, all you need to do then is take the Minkowski Difference of the 2 polygons, which itself results in a new polygon (B - A, in the demo), and then see if that polygon contains the origin.

4) There are many papers online regarding algorithms to implement MD, but I don't know if you'll have the ability to read the paper and translate that into code. Since it is tricky vector math to take the MD of the two polygons (the lat/long rectangle you're testing, and the polygon contained in the bounding box which overlapped the lat/long rectangle), and you have told us that your experience level is not high yet, I would suggest using a library that already implements MD, or even better, implements collision detection.

For example:

http://physics2d.com/content/gjk-algorithm

Here, you can see the relevant pseudo-code, which you could port into Python:

if aO cross ac > 0 //if O is to the right of ac
    if aO dot ac > 0 //if O is ahead of the point a on the line ac
        simplex = [a, c]
        d =-((ac.unit() dot aO) * ac + a)
    else // O is behind a on the line ac
        simplex = [a]
        d = aO
else if ab cross aO > 0 //if O is to the left of ab
    if ab dot aO > 0 //if O is ahead of the point a on the line ab
        simplex = [a, b]
        d =-((ab.unit() dot aO) * ab + a)
    else // O is behind a on the line ab
        simplex = [a]
        d = aO
else // O if both to the right of ac and to the left of ab
    return true //we intersect!

If you are unable to port this yourself, perhaps you could contact either of the authors of the 2 links I've included here--they both implemented the MD algorithm in Flash, perhaps you could license the source code.

5) Finally, assuming that you've handled the collision detection, you can simply store in the database a boolean as to whether the lat/long pair is part of the United States. Once that's done, I have no doubt you will be able to do as you'd like with your Google Maps piece.

So, to sum up, the only difficult piece here is to either 1) implement the collision detection GJK algorithm, or alternatively, 2) write an algorithm that will first calculate the Minkowski Difference between your lat/long pair and the land polygon contained within your AABB and then secondly see if that MD polygon contains the origin. If you use that approach, Ray Casting (typical point-in-a-polygon solution) would do the trick with the second part.

I hope this gives you a start in the right direction!

3
votes

I think this other question answers a good portion of what you are trying to do

How do I determine if two convex polygons intersect?

The other portion is that if you are using a database, I would load in all polygons near your view point from both sets (the set of the map polygons and the set of other polygons you generated) and then run the above algorithm on this smaller set of polygons and you can generate a list of all polygons in your set that should be overlayed on the map.