
Summary:

I'm trying to calculate the area of a large number of polygons in R. I've read a few posts about how I might do this (Example #1 & Example #2), but the problem is that my shapefile is too large (1.7 GB) to import. Since I can't import the file, I can't calculate the area of the polygons.

Extended Explanation:

I'm actually trying to calculate the area of properties in Victoria, Australia; the polygons represent these properties. I downloaded the simplified models 1 and 2 of VicMaps from Spatial Datamart for all of Victoria. However, given the size of the shapefiles, I had to narrow my search to a single local government area (LGA), whose shapefile was only 15.5 MB, and calculate the polygon areas just for testing.

library(raster)
# Read the property polygons for one LGA
x <- shapefile("D:/Downloads/SDM616230/ll_gda94/shape/lga_polygon/ballarat/VMPROP/PROPERTY_PRIMARY_APPROVED.shp")
# Check the coordinate reference system (ll_gda94 is longitude/latitude)
crs(x)
# For a longitude/latitude CRS, area() returns square metres; convert to km^2
x$area_sqkm <- area(x) / 1000000

This worked, but it's not a practical solution to my problem given there are many LGAs in Victoria, and I plan to eventually follow the same process for Queensland and NSW.

However, trying to load a larger shapefile doesn't work and fails with the error "Error: memory exhausted (limit reached?)".

I've tried using readShapePoly, readOGR, st_read and read_sf to get the large shapefile into R, but none of them work; I think the file is just too large. I tried using a select query within read_sf (roughly as sketched below) in an effort to reduce the size of what I was reading, but that didn't work either. I've read online that I should split the shapefile into just the data I need to reduce the size, but I have no idea how to do that.
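For reference, this is roughly the kind of query I tried (a minimal sketch; for a shapefile, the layer name is the file name without the extension, and the PFI column is just an example attribute, so substitute whatever st_layers() reports):

library(sf)
# List layer names and feature counts without loading the data
st_layers("D:/Downloads/SDM616230/ll_gda94/shape/lga_polygon/ballarat/VMPROP/PROPERTY_PRIMARY_APPROVED.shp")
# OGR SQL: read only the columns needed (the geometry is included implicitly)
x <- read_sf(
  "D:/Downloads/SDM616230/ll_gda94/shape/lga_polygon/ballarat/VMPROP/PROPERTY_PRIMARY_APPROVED.shp",
  query = "SELECT PFI FROM PROPERTY_PRIMARY_APPROVED"
)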

Hope you can help.


1 Answer


Obviously the file is too big to fit in memory on a single box. I think the options then are either:

1) Split the file into smaller pieces and process them one by one (an in-R version of this is sketched after the link below). See

https://gis.stackexchange.com/questions/195508/split-a-shapefile-into-smaller-files-on-linux-command-line
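If you want to stay in R, here is a rough sketch of the same idea: read the shapefile in fixed-size chunks of features using an OGR SQL filter on the FID, compute areas per chunk, and free memory in between. The path and chunk size are placeholders:

library(sf)

f     <- "path/to/PROPERTY_PRIMARY_APPROVED.shp"   # placeholder path to the big shapefile
layer <- "PROPERTY_PRIMARY_APPROVED"               # for shapefiles, layer = file name
n     <- st_layers(f)$features[1]                  # total feature count, read from the header
step  <- 50000                                     # features per chunk; tune to your RAM

areas <- lapply(seq(0, n - 1, by = step), function(from) {
  q     <- sprintf("SELECT FID FROM %s WHERE FID >= %d AND FID < %d",
                   layer, from, from + step)
  chunk <- read_sf(f, query = q)                   # load only this chunk
  a     <- as.numeric(st_area(chunk)) / 1e6        # m^2 -> km^2
  rm(chunk); gc()                                  # release memory before the next chunk
  a
})

area_sqkm <- unlist(areas)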

2) Use some DBMS or data warehouse to do it; they handle such batching automatically.
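For example, with PostGIS: load the shapefile into a table outside R (shp2pgsql or ogr2ogr both work), then let the database compute the areas. A minimal sketch, where the database name, table name and geometry column are all placeholders:

library(DBI)
library(RPostgres)

con <- dbConnect(RPostgres::Postgres(), dbname = "gis")   # placeholder connection details

# Casting to geography makes ST_Area return square metres for lat/long data
areas <- dbGetQuery(con, "
  SELECT gid, ST_Area(geom::geography) / 1e6 AS area_sqkm
  FROM property_primary_approved
")

dbDisconnect(con)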