0
votes

I'm trying to use ggplot to plot the polygons of a shapefile (this one) in order to visualize some data about the regions contained therein. The code below produces an error:

library(rgdal)
library(tidyverse)
library(maptools)

hsa <- readOGR(dsn = "Data/hsa_bdry", layer = "HSA_Bdry") %>% fortify(region = "HSA93")

ggplot() + geom_polygon(data = hsa, aes(x = long, y = lat, group = group))

Error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point -103.18495682693303 24.986200983871445 at -103.18495682693303 24.986200983871445

How can I keep the data from the shapefile @data slot attached to the fortified data frame without getting plotting errors?

What I've tried so far:

If I remove the region argument from fortify it succeeds in producing a map, but I need the region id so I can merge on the rest of my data (to pass to aes arguments in ggplot).

I tried adding spTransform(CRS("+init=epsg:4269")) to my pipeline before calling fortify (the documentation for the shapefile says it uses North American Datum 1983, which is EPSG 4269) but then I get an error

Error in spTransform(xSP, CRSobj, ...) : No transformation possible from NA reference system

1

1 Answers

2
votes

As the error message indicates, there are invalid self-intersecting points in the shapefile, causing problems for fortify. You can pre-process the shapefile with gBuffer from rgeos package first:

# read in the shapefile as SpatialPolygonsDataFrame, but don't fortify it yet
hsa <- readOGR(dsn = "hsa_bdry", layer = "HSA_Bdry")

# check if there are invalid geometries & correct that
rgeos::gIsValid(hsa) #returns FALSE
hsa <- rgeos::gBuffer(hsa, byid = TRUE, width = 0)
rgeos::gIsValid(hsa) #returns TRUE

# fortify shouldn't encounter problems now.
hsa <- hsa %>% fortify(region = "HSA93")

Here's a example of the result, using basic theme options:

ggplot() + 
  geom_polygon(data = hsa, aes(x = long, y = lat, group = group),
               fill = "white", col = "black") +
  coord_map()

map

This post from GIS Stack Exchange discusses this issue too.

Edit: add visual inspection of invalid geometries before gBuffer (optional)

If you wish to inspect the invalid geometries before correcting them, you can view the shapefile & zoom in at the coordinates with reported problems. I usually prefer to do this in GIS software myself, but it's possible to plot the SpatialPolygonsDataFrame in R using base package's plot function.

# after loading in the shapefile but before any transformation
plot(hsa, col = "lightblue");axis(1);axis(2); title(main = "Full view")
points(x = -103.18495682693303, y = 24.986200983871445, 
       col = "red", cex = 5)

full view

# toggle xlim & ylim till we get a good zoom
plot(hsa, col = "lightblue",
     xlim = c(-103.185, -103.18),
     ylim = c(24.98, 24.99)); axis(1); axis(2); title(main = "Zoom in")
points(x = -103.18495682693303, y = 24.986200983871445, 
       col = "red", cex = 5)

We can see where the problem lies. The point where the two lines meet in that triangle overlap just slightly. It's minor enough that you probably won't notice it yourself, but it'll cause problems when we try to fortify it.

zoom in 1

After performing the gBuffer step, view the shapefile again (no change to the code except title):

zoom in 2