Find best matching overlapping polygons in R

Question

I have two shapefiles that I have read into R using readOGR() as SpatialPolygonsDataFrame objects. Both are maps of New Zealand with different internal boundaries. One has about 70 polygons representing territorial authority boundaries; the other has about 1900 representing area units.

My aim - an annoyingly basic part of a bigger project - is to use these maps to produce a reference table that can look up an area unit and return which territorial authority it is mostly in. I can use over() to find the which polygons overlap, but in many cases area units seem to be, at least in small part, within multiple territorial authorities - even though looking at individual cases suggests that normally 90%+ of an area unit is in a single territorial authority.

Is there a ready to hand means that does what over() does but which can identify not just (or not even) all the overlapping polygons, but which of the several overlapping polygons is the most overlapping in each case?

Peter Ellis Peter Ellis · Accepted Answer · 2013-01-09T05:58:13

Here is code that did the job, drawing on @Silverfish's answer

library(sp)
library(rgeos)
library(rgdal)

###
# Read in Area Unit (AU) boundaries
au <- readOGR("C:/Users/Peter Ellis/Documents/NZ", layer="AU12")

# Read in Territorial Authority (TA) boundaries
ta <- readOGR("C:/Users/Peter Ellis/Documents/NZ", layer="TA12")

###
# First cut - works ok when only one TA per area unit
x1 <- over(au, ta)
au_to_ta <- data.frame(au@data, TAid = x1)

###
# Second cut - find those with multiple intersections
# and replace TAid with that with the greatest area.

x2 <- over(au, ta, returnList=TRUE)

# This next loop takes around 10 minutes to run:
for (i in 1:nrow(au_to_ta)){
    tmp <- length(x2[[i]])
    if (tmp>1){
        areas <- numeric(tmp)
        for (j in 1:tmp){
            areas[j] <- gArea(gIntersection(au[i,], ta[x2[[i]][j],]))
            }
#       Next line adds some tiny random jittering because
#       there is a case (Kawerau) which is an exact tie
#       in intersection area with two TAs - Rotorua and Whakatane

        areas <- areas * rnorm(tmp,1,0.0001)

        au_to_ta[i, "TAid"] <- x2[[i]][which(areas==max(areas))]
    }

}


# Add names of TAs
au_to_ta$TA <- ta@data[au_to_ta$TAid, "NAME"]

####
# Draw map to check came out ok
png("check NZ maps for TAs.png", 900, 600, res=80)
par(mfrow=c(1,2), fg="grey")
plot(ta, col=ta@data$NAME)

title(main="Original TA boundaries")
par(fg=NA)
plot(au, col=au_to_ta$TAid)
title(main="TA boundaries from aggregated\nArea Unit boundaries")
dev.off()

enter image description here

Find best matching overlapping polygons in R

2 Answers