extract() data from raster with small polygons - rounded weights too small

Question

Using R, I am trying to extract data from a raster layer using a polygon layer. The polygons are much smaller than the raster cells:

enter image description here

Now I call extract() from raster library:

a <- extract(raster, polygons, weights = TRUE, small = TRUE)
a
# ...
# [[1551]]
# value weight
#   209   0.03 # top left cell - more than 50% of the polygon area

There are two problems - the weight is the proportion of the cell area covered by the polygon, and the weights are rounded to 1/100. In my case, only the top left cell is present in the output (value 209) - the weight of 3 other cells was rounded to zero and they were excluded. However, the bottom left cell covers significant proportion of the polygon and should be included also!

I need a proper weighted mean. Can this be done somehow else using `extract()`? Or any other way?

PS: note aside: I think the weights in extract() are not designed very well - the weight should be the proportion of polygon area covered by the particular cell, not vice versa. Then, the weighted mean for the polygon would be also easier to compute (just multiply the two numbers in each row and sum up), and rounding to 1/100 wouldn't be a big problem.

Reproducible example - (download the files - simplified version, actual data are much bigger):

require(raster)
rast <- raster("my.tif")
poly <- readOGR(".", "socc_buff_Rx")
a <- extract(rast, poly, weights = TRUE, small = TRUE)
a

Related: Extract in R fails for small polygons and raster

Can't you expand raster resolution to capture (or approach) the covered area proportion? — Paulo E. Cardoso
@PauloCardoso thanks - this was actually one of my 4 ideas how to solve problem: 1) persuade extract to sane behaviour, 2) convert raster cells to polygons, 3) read raster in R cell by cell, in fact implementing my own extract, 4) increase the raster resolution by further splitting the cells (your idea). So it depends which solution is in fact easiest and also computationally more efficient. And I don't know yet how to implement these solutions... — Tomas

Robert Hijmans Robert Hijmans · Accepted Answer · 2013-12-22T02:27:23

I think the easiest, albeit inelegant, solution is to disaggregate the RasterLayer first. I will have a look to see if I can change the extract function to do this automatically for very small (relative to the cells size) polygons.

library(raster)
r <- raster("my.tif")
pu <- shapefile("socc_buff_Rx.shp")
p <- spTransform(pu, crs(r))

extract(r, p, weights = TRUE, small = TRUE)
#[[1]]
# value weight
#   209   0.03

rr <- disaggregate(r, 10)
e <- extract(rr, p, weights = TRUE, small = TRUE)
lapply(e, function(x) { aggregate(x[,2,drop=F], list(value=x[,1]), sum ) } )

#[[1]]
#  value weight
#1   197   0.95
#2   209   3.44
#3   256   0.31
#4   293   0.04

plot(r, legend=F)
plot(p, add=T)
text(r)

raster cells with values

extract() data from raster with small polygons - rounded weights too small

I need a proper weighted mean. Can this be done somehow else using extract()? Or any other way?

1 Answers

I need a proper weighted mean. Can this be done somehow else using `extract()`? Or any other way?