0
votes

I am looking for a way to measure minimum, maximum, and average distance between the edges of polygons within the same planar projected shapefile (UTMs) using Program R. Is there a package out there that has these capabilities? So far I am coming up short, with the package "rgeos" being the closest with "gDistance", however this seems to only return the minimum distance between polygons. This is to include several measures of dispersion for polygons. As an example I have here the islands from the state of Hawaii. You can download the US states shapefile here. I then isolate and project the state of Hawaii:

library(rgdal)
library(rgeos)

states = readOGR(dsn=path.expand(),layer = "states")
hawaii = states[states$STATE_NAME == "Hawaii", ] 
hawaii = spTransform(hawaii, CRS("+proj=utm +zone=4 ellps=WGS84"))

I would like to measure the maximum distance between the edges of islands within the state of Hawaii, the minimum distance between island edges, and the mean distance between island edges. I have tried gDistance from rgeo, which should return minimum distance between 2 edges, but it currently returns zeros for me:

gDistance(hawaii)

Any ideas? Trying the following using the "byid" call just returns three 0s:

> gDistance(hawaii, hawaii, byid = TRUE)
  0
0 0

I hope to make this part of a for loop to assess proximity metrics for ~200 separate polygon shapefiles with multiple polygons in each file. I just need to calculate proximity within the polygons of each shapefile, not across different shapefiles. Thank you for the help.

1
If you run gDistance(hawaii, byID = TRUE), you should get a matrix showing the minimum distance between every pair of islands(polygons) in the dataset. This will be quite slow if there are many (small) islands in the dataset though.Jul
@Jul, coincidentally I just tried this, but it returns three 0s... I have edited my post to reflect this.Mo Correll
I just checked the shapefile you referenced. The individual islands do not have separate IDs. This means the gDistance function just sees Hawaii as one big object with the ID '0'. It returned a matrix indicating the distance between ID '0' and ID '0', 0 metres.Jul
I added an answer for the 'minimum' part of the your question, but for the 'maximum', can you confirm one thing please? Do you need the furthest distance across any two islands (e.g. the distance from eastern edge of Island 1 to the western edge of Island 2, presuming the islands are 'next to each other')?Jul

1 Answers

1
votes

Firstly, for the minimum, generally if you run gDistance(hawaii, byID = TRUE), you will get a matrix showing the minimum distance between every pair of islands(polygons) in the dataset. However, as discussed in the comments, for this to work, each island needs to have its own ID in the polygon file.

For Hawaii and the shapefile you specified, this approach will work for getting the minimum distance:

library(sp)    
hawaii_out <- disaggregate(hawaii)
gDistance(hawaii_out,byid = T)
              1         2         3         4         5         6         7
    1      0.00  26246.85 189520.49 299489.75 333273.01 367584.38 475015.98
    2  26246.85      0.00 117413.58 228699.22 263368.91 296349.18 406123.52
    3 189520.49 117413.58      0.00  41995.90  76905.51 110099.62 219964.68
    4 299489.75 228699.22  41995.90      0.00  13568.15  12738.74 129211.73
    5 333273.01 263368.91  76905.51  13568.15      0.00  14052.47 115235.51
    6 367584.38 296349.18 110099.62  12738.74  14052.47      0.00  46840.79
    7 475015.98 406123.52 219964.68 129211.73 115235.51  46840.79      0.00

(Of course you'll need to check which ID corresponds with which island).

This runs reasonably quickly for Hawaii and that shapefile, but this function can easily take a very long time if the number of islands (polygons) is high.

EDIT: Adding an approach that can identify and measure the most extreme points of an island pair

library(leaflet)
library(leaflet.extras)
library(sp)
library(tidyr)
library(dplyr)

start <- Sys.time()
##Need the original long/lat shapefile for this
hawaii_ll = states[states$STATE_NAME == "Hawaii", ] 
hawaii_out_ll <- disaggregate(hawaii_ll) ##Separate the islands

##Exact the original coordinates from the polygon file. Since we're dealing with straight line polygons, the furthest distance between any two islands should lie on a vertice.
IslandCoords <- list()
for(i in rownames(hawaii_out_ll@data)){
IslandCoords <- hawaii_out_ll@polygons[[as.numeric(i)]] %>% 
  slot("Polygons") %>% 
  .[[1]] %>% 
  slot("coords") %>% 
  cbind.data.frame(.,"Island"=i,"coord_ID"=paste0(i,"_",1:nrow(.)),stringsAsFactors=F) %>% 
  bind_rows(IslandCoords,.)
}
colnames(IslandCoords)[1:2] <- c("longitude","latitude")

##Double for loop will calculate max distance for each pair of islands in the dataset
all_res <- list() ##Initialise list for final results
for (island1 in unique(IslandCoords$Island)){
  temp_res <- list() ##List for temp results
for (island2 in unique(IslandCoords$Island)) {
  if(island1!=island2){   ##Avoid running the loop on the same island
##subset points to a single pair of islands
IslandCoordsTemp <- IslandCoords[IslandCoords$Island==island1|
                                   IslandCoords$Island==island2,]  
## Derive the convex hull (outermost points) for the island pair
IslandCoordsTemp <- IslandCoordsTemp[chull(IslandCoordsTemp[,1:2]),] 

##Calculate distance matrix between points, tidy it and order by distance
IslandTemp_scores <- spDists(as.matrix(IslandCoordsTemp[,1:2]),longlat=T) %>% 
  data.frame("coord_ID_A"=IslandCoordsTemp$coord_ID,
             "Island_A"=IslandCoordsTemp$Island,
             .) %>% 
  gather("coord_ID_B","distance",3:ncol(.)) %>% 
  arrange(desc(distance))

##Next two lines are to check and filter the data to ensure the maximum distance picked out is actually between points on differing islands
IslandTemp_scores$coord_ID_B <- IslandCoordsTemp$coord_ID[as.numeric(gsub("X","",IslandTemp_scores$coord_ID_B))] 
IslandTemp_scores$Island_B <- IslandCoordsTemp$Island[match(IslandTemp_scores$coord_ID_B,IslandCoordsTemp$coord_ID)]
IslandTemp_scores <- IslandTemp_scores %>% 
  filter(IslandTemp_scores$Island_A != IslandTemp_scores$Island_B) %>% 
  head(1)

##Place results in temp list
temp_res <- bind_rows(temp_res, 
            data.frame("Island1"=island1, 
           "Island2"=island2,
           "distance"=IslandTemp_scores$distance,
           stringsAsFactors = F))

##Use this to make sure the code is running as expected
print(paste(island1,island2))
}
}
  ##Bind all results into one data frame
  all_res <- bind_rows(all_res,temp_res)
 }

##Spread into matrix (if needed, just to match gDistance's appearance)
all_res_spread <- all_res %>% spread(key = Island2,value = distance,fill = 0)

Units are in km.

  Island1        1        2        3         4         5        6        7
1       1   0.0000 104.1285 272.4133 372.96831 374.27478 457.4984 624.7161
2       2 104.1285   0.0000 235.0730 334.42077 338.90971 420.2209 592.3716
3       3 272.4133 235.0730   0.0000 168.24874 174.68062 254.1973 430.2157
4       4 372.9683 334.4208 168.2487   0.00000  65.76585 143.4336 319.7396
5       5 374.2748 338.9097 174.6806  65.76585   0.00000 112.0591 283.6706
6       6 457.4984 420.2209 254.1973 143.43355 112.05911   0.0000 258.1099
7       7 624.7161 592.3716 430.2157 319.73960 283.67057 258.1099   0.0000

You can use leaflet and the addMeasures add-on from leaflet.extras to sense check the results.

##Can use this to sense-check/confirm the results
leaflet(hawaii_out_ll) %>% addPolygons(label = row.names(hawaii_out_ll@data)) %>% 
  addProviderTiles(providers$CartoDB) %>% addMeasure(primaryLengthUnit = "metre") %>% 
  addMarkers(data=IslandCoordsTemp)

Leaflet Measure