If the points you want to query are in a data.frame holding x and y coordinates plus the species name that identifies the layer to query, you can use these two commands to do everything:
# Find the layer to match on using 'grepl' and 'which' converting all names to lowercase for consistency
df$layer <- lapply( df$species , function(x) which( grepl( tolower(x) , tolower(names(s)) ) ) )
# Extract each value from the appropriate layer in the stack
df$Value <- sapply( seq_len(nrow(df)) , function(x) extract( s[[ df$layer[x] ]] , df[ x , 1:2 ] ) )
How it works
Starting from the first line:
- First we define a new column, df$layer, which will hold the index of the RasterLayer in the stack that we need to use for that row. lapply iterates over the elements of the column df$species and applies an anonymous function to each one in turn, passing it in as the input variable x. lapply is a loop construct even though it doesn't look like one.
- On the first iteration we take the first element of df$species, which is now x, and use it in grepl (the name means something like 'global regular expression pattern matching, logical') to find which of the names of our stack s contain our species pattern. We use tolower() on both the pattern to match (x) and the elements to search in (names(s)) so that we still get a match when the case differs; without it, "Tiger" won't find "tiger". grepl returns a logical vector showing which elements contain the pattern, e.g. grepl( "abc" , c("xyz", "wxy" , "acb" , "zxabcty" ) ) returns FALSE , FALSE , FALSE , TRUE ("acb" has the right letters but in the wrong order). We use which to get the index of the TRUE elements.
- The idea is that we get one, and only one, match of a layer in the stack to the species name for each row, so the only TRUE index will be the index of the layer in the stack we want.
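The matching step can be tried on its own, without any rasters. This is only a minimal sketch in base R: layer_names is a plain character vector standing in for names(s), and the variable names (layer_names, species, layer_idx, case_sensitive, n_matches) are made up for illustration. It also adds a check, not in the original two lines, that each species really does match exactly one layer.

```r
# Stand-in for names(s): just the layer names, so no raster package is needed
layer_names <- c("LIon_medIan", "PANTHeR_MEAN_AVG", "tiger.Mean.JULY_2012")
species <- c("lion", "Tiger", "panther")

# Same lookup as the first line of the answer, applied to the stand-in names
layer_idx <- lapply(species, function(x) which(grepl(tolower(x), tolower(layer_names))))

# Without tolower() the match is case-sensitive, so "Tiger" finds nothing
case_sensitive <- which(grepl("Tiger", layer_names))

# Assumption made explicit: every species matches exactly one layer
n_matches <- lengths(layer_idx)
stopifnot(all(n_matches == 1))

# Flatten the list to see the layer index chosen for each species
unlist(layer_idx)
```

Note that grepl treats its pattern as a regular expression by default. If a species name could contain characters such as "." or "(", passing fixed = TRUE to grepl makes it match the text literally.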
On the second line, sapply:
- sapply is an iterator much like lapply, but it simplifies its result to a vector where it can, rather than returning a list of values. To be honest, you could use either in this use-case.
- Now we iterate across a sequence of numbers from 1 to nrow(df).
- We use the row number in another anonymous function as our input variable x.
- We want to extract the value at the "x" and "y" coordinates (columns 1 and 2 respectively) for the current row (given by x) of the data.frame, using the layer that we got in our previous line.
- We assign the result of doing all this to another column in our data.frame, which contains the extracted value at that x/y coordinate for the appropriate layer.
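To see the difference between sapply and lapply in this row-wise pattern, here is a small base R sketch. It does not use the raster package: layers is a hypothetical list of named numeric vectors standing in for the stack, and a name lookup stands in for extract(). All names here (layers, pts, as_vector, as_list) are invented for the illustration.

```r
# Hypothetical stand-in for the stack: one small named vector per "layer"
layers <- list(c(a = 1, b = 2), c(a = 101, b = 102), c(a = 1001, b = 1002))

# Each row says which key to look up and which layer to look it up in
pts <- data.frame(key = c("a", "b", "a"), layer = c(1, 3, 2), stringsAsFactors = FALSE)

# Row-wise loop: pick the layer for row i, then look up that row's key in it
as_vector <- sapply(seq_len(nrow(pts)), function(i) layers[[ pts$layer[i] ]][[ pts$key[i] ]])
as_list   <- lapply(seq_len(nrow(pts)), function(i) layers[[ pts$layer[i] ]][[ pts$key[i] ]])

is.numeric(as_vector)  # sapply simplified the results to a numeric vector
is.list(as_list)       # lapply kept one list element per row

# Either result can be stored as a column; the vector prints more compactly
pts$Value <- as_vector
```

The values land on the scale of the layer each row points at, which is the same sanity check the worked example relies on.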
I hope that helps!!
And a worked example with some data:
library( raster )  # library() stops with an error if the package is missing
# Sample rasters - note the scale of values in each layer
# Tens
r1 <- raster( matrix( sample(1:10,100,repl=TRUE) , ncol = 10 ) )
# Hundreds
r2 <- raster( matrix( sample(1e2:1.1e2,100,repl=TRUE) , ncol = 10 ) )
# Thousands
r3 <- raster( matrix( sample(1e3:1.1e3,100,repl=TRUE) , ncol = 10 ) )
# Stack the rasters
s <- stack( r1,r2,r3 )
# Name the layers in the stack
names(s) <- c("LIon_medIan" , "PANTHeR_MEAN_AVG" , "tiger.Mean.JULY_2012")
# Data of points to query on
df <- data.frame( x = runif(10) , y = runif(10) , species = sample( c("lion" , "panther" , "Tiger" ) , 10 , repl = TRUE ) )
# Run the previous code
df$layer <- lapply( df$species , function(x) which( grepl( tolower(x) , tolower(names(s)) ) ) )
df$Value <- sapply( seq_len(nrow(df)) , function(x) extract( s[[ df$layer[x] ]] , df[ x , 1:2 ] ) )
# And the result (note the scale of Values is consistent with the scale of values in each rasterLayer in the stack)
df
# x y species layer Value
#1 0.4827577 0.7517476 lion 1 1
#2 0.8590993 0.9929104 lion 1 3
#3 0.8987446 0.4465397 tiger 3 1084
#4 0.5935572 0.6591223 panther 2 107
#5 0.6382287 0.1579990 panther 2 103
#6 0.7957626 0.7931233 lion 1 4
#7 0.2836228 0.3689158 tiger 3 1076
#8 0.5213569 0.7156062 lion 1 3
#9 0.6828245 0.1352709 panther 2 103
#10 0.7030304 0.8049597 panther 2 105