0 votes

I am trying to extract the area below 5000 feet for a 5-state region, by state, using the 30 meter (1 arc-second) resolution NED (National Elevation Dataset). I have a shapefile of the states and a large (about 4 GB) raster of the NED. I am using the velox package, which speeds things up significantly, but I feel there has to be a faster way to do this (it still takes a long time).

With the velox package, I make the raster into a velox object, then extract the cells using velox's extract function. Since I know the resolution of each cell, I count the number of cells below 5000 feet in each state and multiply by the area of each cell.

DEM_velox <- velox(NED)
DEM_cells <- DEM_velox$extract(sp = state_shapefile)

Isn't there a better and much faster way to do this? I know there's the area function from the raster package, and the gArea function from the rgeos package, but those are for polygons (shapefiles), not rasters, as far as I can tell. Any help on this would be great.
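For reference, the counting step described above could be sketched like this (a hypothetical helper, not from the question; it assumes each element of `DEM_cells` holds the extracted elevations in meters for one state, and square 30 m cells):

```r
# Hypothetical helper: area below a critical elevation, in hectares.
# `vals` are extracted cell elevations in meters (one element of DEM_cells);
# cells are assumed square with 30 m sides.
area_below_ha <- function(vals, crit_elev = 5000 / 3.2808, cell_m = 30) {
  sum(vals < crit_elev, na.rm = TRUE) * cell_m * cell_m / 10000
}

# per-state areas, e.g.: sapply(DEM_cells, area_below_ha)
```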

1
It would probably speed things up to crop the NED by your 5-state region or by state so that you're dealing with a much smaller object, and then mask the smaller raster(s) so that elevations outside the state boundaries are set to NA. Then you should be able to simply do a summation such as sum(values(cropped_masked_raster) <= 5000, na.rm=TRUE) * 30 * 30 to get the area in m^2 that meets your condition. – ThetaFC
But in general, unless you have some serious computing power, it is going to be slow going working with this 4 GB dataset. It would be faster to have sets of smaller rasters that cover your AOI and work through them in a loop so you are not overwhelming RAM. – ThetaFC
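A minimal sketch of that crop-then-mask approach (the `crop`/`mask` calls are from the raster package; `NED` and `state_shapefile` are the objects from the question, with elevations assumed to be in meters — the counting step is separated into a small base-R function):

```r
crit_elev <- 5000 / 3.2808  # 5000 ft in meters

# Counting step on its own: given masked cell values (NA outside the state),
# return the area at or below the threshold in m^2, assuming 30 m cells.
count_area_m2 <- function(vals, crit = crit_elev, cell_m = 30) {
  sum(vals <= crit, na.rm = TRUE) * cell_m * cell_m
}

# Per-state loop (sketch; assumes the shapefile and NED share a CRS):
# library(raster)
# for (i in seq_len(length(state_shapefile))) {
#   st <- state_shapefile[i, ]
#   r  <- mask(crop(NED, st), st)  # crop shrinks the object; mask NAs the outside
#   print(count_area_m2(values(r)))
# }
```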

1 Answer

1 vote

Here is an example that solves your problem for California using a loop over a set of smaller rasters (69 in total) that cover the entire state, avoiding extract.

Raster DEMs were downloaded from the NRCS Spatial Data Gateway.

The extract function was avoided by masking each raster so that elevations outside the state boundary are set to NA, then counting the number of cells below the critical elevation with cellStats and the function sum. The comparison ras_masked <= crit_elev converts the raster to TRUE/FALSE values, so the summation counts the cells below 5000 ft. A separate test indicated the extract approach was 5-6 times slower than the mask approach.

The execution time for the loop was 400 seconds.

library(raster)
crit_elev <- 5000 / 3.2808 #critical elevation in meters
demDir <- yourDEMdirectoryHERE
CADir <- yourStateBoundaryDirectoryHERE
list.files(demDir)
fnames <- list.files(demDir, pattern = '\\.tif$', full.names = TRUE)
length(fnames) #69 rasters that cover CA but some do spill over state boundary, so mask is necessary

#read in state boundary and project to DEM crs
list.files(CADir)
CA_TA.shp <- shapefile(file.path(CADir, 'california_CA_TA.shp'))
CA_UTM11N <- spTransform(CA_TA.shp, crs(raster(fnames[1])))

#loop to solve the problem for CA
elev_count <- vector(mode='integer', length = length(fnames)) #this is where we will keep track of how many cells in each dem are below the crit_elev
system.time(for (i in seq_along(fnames)) {
  ras_masked <- mask(raster(fnames[i]), CA_UTM11N)
  elev_count[i] <- cellStats(ras_masked <= crit_elev, sum)
  print(i)
})
#user  system elapsed 
#175.76  224.67  400.70 
#count cells below 5000 ft, multiply by cell area (30 m x 30 m) and divide by 10,000 to get hectares
sum(elev_count) * 30 * 30 / 10000 
#[1] 19247149