I have a set of points and want to extract values from several large rasters within a buffer around each point. The rasters are too large to be held in memory (> 1e10 cells). I illustrate my current approach below, but I would be interested in a faster one.
library(maps)
library(sf)
library(raster)
library(dplyr)
library(parallel)
# sf object with the points around which we want buffer values
crs <- "+proj=utm +zone=33 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs"
map <- sf::st_as_sf( maps::map(regions = "Sweden", plot = FALSE, fill = TRUE))
map <- st_transform(map, crs = crs)
sf_points <- st_sfc(st_sample(map, 100), crs = crs)
# attach an id column and turn the sampled points into an sf data frame
sf_points <-
  data.frame(A = 1:length(sf_points)) %>%
  st_set_geometry(sf_points)
# raster too large to fit in memory
# the raster(s) I am working on has 10m resolution
r <- raster(extent(map), nrow = 15000, ncol = 7000, crs = crs)
values(r) <- rep(sample(1:10, 77, replace = TRUE), length.out = ncell(r))
# use the parallel package for parallel processing
cluster <- makeCluster(4)
# export the raster, the points and as_Spatial() to the workers
clusterExport(cluster, c("r", "sf_points", "as_Spatial"))
List_points <-
  sf_points %>%
  mutate(split = rep(1:ceiling(n() / 4), each = 4, length.out = n())) %>% # split into chunks of 4 points to spread over the workers
  split(f = .$split) %>%
  parLapply(cl = cluster, X = ., function(x) raster::extract(r, y = as_Spatial(x), buffer = 5000)) %>%
  unlist(recursive = FALSE)
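The cluster is released once the extraction is done:
stopCluster(cluster)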
I repeat the extraction for each raster. Because the values stay in point order, I can then summarize the pixel values across the lists (see the sketch below). I cannot (easily) create a raster stack because the rasters have different extents.
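As a minimal sketch of that summary step (assuming List_points holds one numeric vector of cell values per point, as produced above; buffer_means is just an illustrative name):
# mean of the buffered cell values for each point, for one raster
buffer_means <- sapply(List_points, mean, na.rm = TRUE)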
Using the velox package as suggested here doesn't seem to work, because it tries to load the raster into memory, which fails. I could try to load the raster chunk-wise, but then I would need to figure out which points fall on which chunk (a rough sketch of that bookkeeping follows).
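Roughly, assuming a hypothetical chunk_extents list of raster::Extent objects describing the tiles (buffers straddling a chunk boundary would still need extra handling):
# buffer the points and find which chunk(s) each buffer overlaps
sf_buffers <- st_buffer(sf_points, dist = 5000)
chunk_polys <- do.call(c, lapply(chunk_extents, function(e) {
  st_as_sfc(st_bbox(c(xmin = xmin(e), ymin = ymin(e),
                      xmax = xmax(e), ymax = ymax(e)),
                    crs = st_crs(sf_points)))
}))
# element i gives the indices of the chunks that buffer i overlaps
chunk_hits <- st_intersects(sf_buffers, chunk_polys)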