My goal is to count the frequency of extreme weather events within subnational African regions. To do so, I have set up a shapefile of African provinces, built mostly from GADM data, and I use the new geocoded EMDAT GDIS dataset for point data on weather events.
This is what the region shapefile looks like:
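For context, the counting step I have in mind looks roughly like the sketch below. It runs on toy data only: `toy_regions`, `toy_events` and the helper `make_square()` are hypothetical stand-ins, and using `st_intersects()` with `lengths()` is just one possible way to count points per polygon, not necessarily the best one.

```r
library(sf)

# Hypothetical helper: a square polygon with lower-left corner (x0, y0)
make_square <- function(x0, y0, size = 10) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
    c(x0, y0 + size), c(x0, y0)
  )))
}

# Two toy regions (stand-ins for provinces), in WGS 84
toy_regions <- st_sf(
  region = c("A", "B"),
  geometry = st_sfc(make_square(0, 0), make_square(20, 0), crs = 4326)
)

# Four toy events (stand-ins for GDIS points); the last one lies in no region
toy_events <- st_as_sf(
  data.frame(longitude = c(2, 5, 25, 50), latitude = c(3, 5, 5, 50)),
  coords = c("longitude", "latitude"),
  crs = 4326
)

# For each region, count how many events intersect it
toy_regions$n_events <- lengths(st_intersects(toy_regions, toy_events))
print(st_drop_geometry(toy_regions))
```

This only gives sensible counts if the points and polygons actually overlap, which is where my problem starts.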
library(sf)
st_geometry(africa_map)
Geometry set for 796 features
Geometry type: GEOMETRY
Dimension: XY
Bounding box: xmin: -25.36042 ymin: -46.96575 xmax: 63.49391 ymax: 37.3452
Geodetic CRS: WGS 84
First 5 geometries:
MULTIPOLYGON (((-4.821226 24.99475, -4.821355 2...
MULTIPOLYGON (((1.853562 35.8605, 1.8424 35.865...
MULTIPOLYGON (((-1.361976 35.3199, -1.358957 35...
MULTIPOLYGON (((2.984874 36.81497, 3.014171 36....
MULTIPOLYGON (((7.262677 37.076, 7.266449 37.07..
And this is the GDIS dataset after converting the longitude and latitude columns to an sf object in WGS 84:
gdis_africa_sf <- st_as_sf(x = gdis_africa,
                           coords = c("longitude", "latitude"),
                           crs = "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
st_geometry(gdis_africa_sf)
Geometry set for 5171 features
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -34.04233 ymin: -25.19619 xmax: 37.08849 ymax: 63.4228
CRS: +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
First 5 geometries:
POINT (-17.09348 15.66576)
POINT (-16.53153 15.77399)
POINT (-16.20006 15.84419)
POINT (-17.09348 15.66576)
POINT (-16.53153 15.77399)
At this point you can already tell that something is off: the bounding boxes do not correspond at all, even though the coordinate reference systems of the two objects match.
st_crs(africa_map) == st_crs(gdis_africa_sf)
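To make the bounding box comparison easier to read, something like the following sketch could be used. It runs on toy data (`toy_map` and `toy_points` are hypothetical stand-ins for `africa_map` and `gdis_africa_sf`, and `bbox_row()` is a hypothetical helper); it puts both bounding boxes in one table and counts the points that fall outside the map's extent.

```r
library(sf)

# Toy stand-ins (hypothetical data, not the real GADM / GDIS objects)
toy_map <- st_sf(
  name = "region_a",
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0)))),
    crs = 4326
  )
)
toy_points <- st_as_sf(
  data.frame(longitude = c(2, 5, 40), latitude = c(3, 5, 50)),
  coords = c("longitude", "latitude"),
  crs = 4326
)

# Hypothetical helper: bounding box of an sf object as a plain named vector
bbox_row <- function(x) {
  bb <- st_bbox(x)
  c(xmin = bb[["xmin"]], ymin = bb[["ymin"]],
    xmax = bb[["xmax"]], ymax = bb[["ymax"]])
}

# One row per object, so the extents can be compared column by column
bbox_table <- rbind(map = bbox_row(toy_map), points = bbox_row(toy_points))
print(bbox_table)

# Count the points whose coordinates lie outside the map's bounding box
xy <- st_coordinates(toy_points)
bb <- bbox_row(toy_map)
outside <- xy[, "X"] < bb[["xmin"]] | xy[, "X"] > bb[["xmax"]] |
  xy[, "Y"] < bb[["ymin"]] | xy[, "Y"] > bb[["ymax"]]
n_outside <- sum(outside)
print(n_outside)
```

With the real objects, the same table would show the two extents quoted above next to each other.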
[1] TRUE
When I plot the two on top of each other, the issue becomes clearer, regardless of whether I use the new sf object or just the raw longitude and latitude columns of the data frame.
library(ggplot2)
# Version 1: both layers as sf objects
ggplot() +
  geom_sf(data = africa_map) +
  geom_sf(data = gdis_africa_sf)
# Version 2: points taken directly from the data frame columns
ggplot(data = africa_map) +
  geom_sf() +
  geom_point(data = gdis_africa, aes(x = longitude, y = latitude),
             color = "red",
             alpha = 0.3,
             size = 2,
             shape = 1)
It looks as if the weather event coordinates are shifted several thousand kilometers to the northwest, but what is causing this? And how can I fix it so that my two geographic datasets are compatible? Any hints would be much appreciated.