5
votes

I made this post in other forum as well, but since I really need a reply, i post it here once more.

I am working in R and want to calculate the value of a polygon derived from the intersecting cells of a raster. The value should consider the weights on each intersecting cell. When I try to run the "extract" function with a sample raster and polygon I get different weights that the ones I calculate manually, resulting in different final value.

Here is my sample code:

require(raster)
r <- raster(nrow=2, ncol=2, xmn=-180, xmx=60, ymn=-30, ymx=90)   
r[] <- c(1,2,4,5)    
s <- raster(xmn=-120, xmx=-40, ymn=20, ymx=60, nrow=1, ncol=1)    
s.pl <- as(s, 'SpatialPolygons')    
w <- raster::extract(r, s.pl, method="simple",weights=T, normalizeWeights=F)    
mean.value <- raster::extract(r, s.pl, method="simple",weights=T, fun=mean) 

The value I get is 2.14 but according to the actual weights of the cells it should be 2. More specifically for every part of the polygon that intersects with different cell the data are:

 Area Value
 1800 1
  600 2
  600 4
  200 5

So the final value of the polygon based on the above should be 2.

Can it be because of the projection that is in lat/lon? But even when I assign projection in meters I have the same result. How can I get the value of 2 that I am interested in? I also tried with "resample" function but I get different results as well.

My final target is to create a new raster with different resolution and extents from the original one and assign the values based on the weights of the cells of the original raster that intersect with the cells of the new raster. But it seems that neither resample nor extract functions give the expected outcome.

2
What other forum? Please link to the question, if you must cross-post.Roman Luštrik
I suspect you're expecting something which extract is unable to deliver. If you read ?extract, you will notice " If y represents polygons, the extract method returns the values of the cells of a Raster* object that are covered by a polygon. A cell is covered if its center is inside the polygon (but see the weights option for considering partly covered cells;". Perhaps you could coerce your raster to a polygon (rasterToPolygons) and find areas of overlap using rgeos package?Roman Luštrik
Thanks for the reply Roman, hvala lepa :) I read about the y polygons, but i though that with the weight option active is would give the desired result. Could you maybe me more specific with the rgeos package?NikMas

2 Answers

4
votes

Let us assume we have a raster A and two SpatialPolygon objects [B, C] that are not rectangular (hexagons in this case). For demonstrating purposes the center of hexagon B is defined to be the center of our raster A (see left plot below). Hexagon C is shifted to the right along the horizontal axis.

require(raster)
require(scales)

A    <- raster(nrow=2, ncol=2, xmn=-180, xmx=180, ymn=-180, ymx=180)
A[]  <- c(1,2,4,5)    
A.pl <- as(A, 'SpatialPolygons')
B    <- SpatialPolygons(list(Polygons(list(Polygon(cbind(c(0, 100, 100, 0, -100, -100, 0), 
                                                         c(100, 50, -50, -100, -50, 50, 100)))), 'B')))
C    <- SpatialPolygons(list(Polygons(list(Polygon(cbind(c(40, 140, 140, 40, -60, -60, 40), 
                                                         c(100, 50, -50, -100, -50, 50, 100)))), 'C')))

Object B

Since hexagon B is in the center, the weights should all equal 0.25. We can easily derive from the plot that the area of the hexagons is 30000 (imagine a square that the hexagon fits in (40000) and substract 2 rectangulars (-10000), each consisting of 2 of the 4 corners you have to cut off). Therefore, each intersection area is of size 7500 and 7500/30000 = 0.25

# get intersections
intsct.B <- raster::intersect(B, A.pl)
intsct.C <- raster::intersect(C, A.pl)

### B
area.B <- B@polygons[[1]]@area

weights <- unlist(lapply(intsct.B@polygons, function(x) {
  slot(x, 'area')/area.B
}))
weights
> [1] 0.25 0.25 0.25 0.25

Now we get the value of the cells that each intersection polygon falls in and compute the mean.

vals    <- unlist(lapply(intsct.B@polygons, function(x) { 
  extract(A, data.frame(t(slot(x, 'labpt'))))
}))

sum(weights * vals)
> [1] 3

As we would expect, the mean of c(1, 2, 4, 5) is 3.

Hexagon B and its intersections with A

Object C

Now lets do the same with object C

### C
area.C <- C@polygons[[1]]@area

weights <- unlist(lapply(intsct.C@polygons, function(x) {
  slot(x, 'area')/area.C
}))
weights
> [1] 0.13 0.37 0.13 0.37

vals    <- unlist(lapply(intsct.C@polygons, function(x) { 
  extract(A, data.frame(t(slot(x, 'labpt'))))
}))

sum(weights * vals)
> [1] 3.24

Again, as we would expect the mean is bigger (since the weights for the cells with values 2 and 5 are higher). Also, since we shifted the hexagon along one axis only, it makes sense that 2 weights occur twice.

Hexagon C and its intersections with A

Raster with higher number of cells

The next plot shows the intersections of B (left hand side) and C (rhs) with a 4x4 raster with values c(1:8, 10:17). For B there are 12 intersections and for C 8. Notice again that the mean for B is exactly 9 because of the symmetry.

enter image description here

This should work for any SpatialPolygons object. Be sure to use the same CRS for the objects you throw into intersect.

0
votes

Here is what I managed to do based on the replies of this post.

require(raster)
require(rgeos)
r <- raster(nrow=2, ncol=2, xmn=-180, xmx=60, ymn=-30, ymx=90)    
r[] <- c(1,2,4,5)    
r <- stack(r, r*2, r^2)
s <- raster(xmn=-120, xmx=-40, ymn=20, ymx=60, nrow=1, ncol=1)    
s.pl <- as(s, 'SpatialPolygons')    
r.s <- as(r, 'SpatialPolygonsDataFrame')
pi1 <- gIntersection(r.s, s.pl, byid = T)
areas1 <- data.frame(area=sapply(pi1@polygons, FUN=function(x) {slot(x, 'area')}))
row.names(areas1) <- sapply(pi1@polygons, FUN=function(x) {slot(x, 'ID')})
areas1$Pol.old <- as.numeric(vapply(strsplit(rownames(areas1), " "), `[`, 1, FUN.VALUE=character(1)))
areas1$pol.new <- as.numeric(vapply(strsplit(rownames(areas1), " "), `[`, 2, FUN.VALUE=character(1)))
f <- r.s@data
seqs <- match(areas1$Pol.old, rownames(f))
ar <- cbind(areas1, f[seqs,])
ar[,-(1:3)] <- ar[,-(1:3)]*ar$area
f <- aggregate.data.frame(ar, by=list(ar$pol.new), FUN=sum)
f[,-(1:4)] <- f[,-(1:4)]/f$area  
ar.v <- as.matrix(f[, -c(1:4)])
s2 <- stack(s)
s1 <- setValues(s2, ar.v)

If somebody can suggest nicer and/or faster code, please let me know, cause I don't like my approach very much.