5
votes

I'm trying to join two sf data frames using inner_join or left_join. Both data frames contain a geometry column. I keep getting the error:
Error in check_join(x, y) : y should be a data.frame; for spatial joins, use st_joinFALSE

Reproducible example below:

library(sf)
library(dplyr)  # provides %>% and inner_join()

df1 <- data.frame(
  var = c("a", "b", "c"),
  lon1 = c(20,35,45),
  lat1 = c(50,10,15)
) %>% st_as_sf(coords = c("lon1", "lat1"), dim = "XY") %>%
  st_set_crs(4326)

df2 <- data.frame(
  var = c("a", "b", "c"),
  lon2 = c(15,25,35),
  lat2 = c(5,10,15)
) %>% st_as_sf(coords = c("lon2", "lat2"), dim = "XY") %>%
  st_set_crs(4326)

df <- inner_join(df1, df2, by = "var")

I would rather not drop the geometry, because I think that could mess up my results afterwards, but any solution is welcome.

1
Your example does not make sense. df1 and df2 are strictly identical. Either you want to join the attribute tables, in which case you can drop the geometries of one dataset, or you want to merge by location, in which case you need to use st_join as explained in the error message. Another possibility is to create intersections between geometries (st_intersection). It depends on what exactly you want to do (probably the first option) - Gilles
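A minimal sketch of the first option from the comment above (attribute join with one geometry dropped), assuming sf and dplyr are installed. The `value` column is a hypothetical attribute added only so the join has something to carry over; it is not part of the original example.

```r
library(sf)
library(dplyr)

# Same toy data as in the question
df1 <- data.frame(var = c("a", "b", "c"), lon1 = c(20, 35, 45), lat1 = c(50, 10, 15)) %>%
  st_as_sf(coords = c("lon1", "lat1"), dim = "XY") %>%
  st_set_crs(4326)
df2 <- data.frame(var = c("a", "b", "c"), lon2 = c(15, 25, 35), lat2 = c(5, 10, 15)) %>%
  st_as_sf(coords = c("lon2", "lat2"), dim = "XY") %>%
  st_set_crs(4326)

# Drop df2's geometry so it becomes a plain data.frame of attributes
df2_attrs <- st_drop_geometry(df2)
df2_attrs$value <- c(1, 2, 3)  # hypothetical attribute, for illustration only

# x is still an sf object and y is a plain data.frame, so the join is allowed
joined <- inner_join(df1, df2_attrs, by = "var")
class(joined)
```

The result keeps df1's geometry; df2's geometry is lost, which is exactly what the question wants to avoid if both point sets are needed later.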

1 Answer

9
votes

If you just want a non-spatial join but need to carry the geometry columns forward, you can 'deactivate' them first (e.g. with as.data.frame()), do the join, and then 'reactivate' whichever geometry column you want to be the active one. The joined data.frame then holds two sfc columns, with 'geometry.x' set as the active one.

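# Convert both sf objects to plain data.frames so dplyr performs an ordinary attribute join
df <- inner_join(df1 %>% as.data.frame(), df2 %>% as.data.frame(), by = "var")

# Turn the result back into an sf object, with geometry.x as the active geometry column
df <- st_sf(df, sf_column_name = 'geometry.x')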

> str(df)
Classes ‘sf’ and 'data.frame':  3 obs. of  3 variables:
 $ var       : Factor w/ 3 levels "a","b","c": 1 2 3
 $ geometry.x:sfc_POINT of length 3; first list element: Classes 'XY', 'POINT', 'sfg'  num [1:2] 20 50
 $ geometry.y:sfc_POINT of length 3; first list element: Classes 'XY', 'POINT', 'sfg'  num [1:2] 15 5
 - attr(*, "sf_column")= chr "geometry.x"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA
  ..- attr(*, "names")= chr  "var" "geometry.y"

> st_crs(df)
    Coordinate Reference System:
      EPSG: 4326 
      proj4string: "+proj=longlat +datum=WGS84 +no_defs"
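
Since both geometry columns survive the join, the active one can be switched later. The following is a sketch under the same assumptions as the answer (sf and dplyr loaded, toy data from the question); `df_y` is just an illustrative name.

```r
library(sf)
library(dplyr)

# Same toy data as in the question
df1 <- data.frame(var = c("a", "b", "c"), lon1 = c(20, 35, 45), lat1 = c(50, 10, 15)) %>%
  st_as_sf(coords = c("lon1", "lat1"), dim = "XY") %>%
  st_set_crs(4326)
df2 <- data.frame(var = c("a", "b", "c"), lon2 = c(15, 25, 35), lat2 = c(5, 10, 15)) %>%
  st_as_sf(coords = c("lon2", "lat2"), dim = "XY") %>%
  st_set_crs(4326)

# Non-spatial join on plain data.frames, then reactivate geometry.x
df <- inner_join(as.data.frame(df1), as.data.frame(df2), by = "var")
df <- st_sf(df, sf_column_name = "geometry.x")

# Make geometry.y the active geometry instead; geometry.x stays as a regular sfc column
df_y <- st_set_geometry(df, "geometry.y")
attr(df_y, "sf_column")
```

Spatial operations such as plotting or st_join act on whichever column is active, so it is worth checking attr(x, "sf_column") before running them.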