I have trouble getting the intersection between two large SpatialPolygonsDataFrame on R. My polygons data represent buildings and administrative boundaries, and I am trying to get the intersection polygons between them.
I understand that the intersect function from the raster package and gIntersection from the rgeos package can do this job (with a few differences) but they cannot handle all my polygons at once (about 50.000 polygons/entity).
For this reason, I have to split my calculation within a loop, saving the result for each step. The problem is: these functions keep filling my physical memory, and I cannot clean it. I tried using rm() and gc(), but it does not change a thing. The memory issue crashes my R session, and I cannot do my calculation.
Is there a way to free the RAM during simulation, within loops ? Or to avoid this memory issue ?
Here comes a reproducible example, for random polygons.
library(raster)
library(sp)
library(rgeos)
#Generating 50000 points (for smaller polygons) and 150000 (for larger polygons) in a square of side 100000
size=100000
Nb_points1=50000
Nb_points2=150000
start_point=matrix(c(sample(x = 1:size,size = Nb_points1,replace = T),sample(x = 1:size,size = Nb_points1,replace = T)),ncol=2)
start_point2=matrix(c(sample(x = 1:size,size = Nb_points2,replace = T),sample(x = 1:size,size = Nb_points2,replace = T)),ncol=2)
#Defining different sides length
radius=sample(x = 1:50,size = Nb_points1,replace = T)
radius2=sample(x = 1:150,size = Nb_points2,replace = T)
#Generating list of polygons coordinates
coords=list()
for(y in 1:Nb_points1){
xmin=max(0,start_point[y,1]-radius[y])
xmax=min(size,start_point[y,1]+radius[y])
ymin=max(0,start_point[y,2]-radius[y])
ymax=min(size,start_point[y,2]+radius[y])
coords[[y]]=matrix(c(xmin,xmin,xmax,xmax,ymin,ymax,ymax,ymin),ncol=2)
}
coords2=list()
for(y in 1:Nb_points2){
xmin=max(0,start_point2[y,1]-radius2[y])
xmax=min(size,start_point2[y,1]+radius2[y])
ymin=max(0,start_point2[y,2]-radius2[y])
ymax=min(size,start_point2[y,2]+radius2[y])
coords2[[y]]=matrix(c(xmin,xmin,xmax,xmax,ymin,ymax,ymax,ymin),ncol=2)
}
#Generating 75000 polygons
Poly=SpatialPolygons(Srl = lapply(1:Nb_points1,function(y) Polygons(srl = list(Polygon(coords=coords[y],hole = F)),ID = y)),proj4string = CRS('+init=epsg:2154'))
Poly2=SpatialPolygons(Srl = lapply(1:Nb_points2,function(y)Polygons(srl = list(Polygon(coords=coords2[y],hole = F)),ID = y)),proj4string = CRS('+init=epsg:2154'))
#Union of overlapping polygons
aaa=gUnionCascaded(Poly)
bbb=gUnionCascaded(Poly2)
aaa=disaggregate(aaa)
bbb=disaggregate(bbb)
intersection=gIntersects(spgeom1 = aaa,bbb,byid = T,returnDense = F)
#Loop on the intersect function
pb <- txtProgressBar(min = 0, max = ceiling(length(aaa)/1000), style = 3)
for(j in 1:ceiling(length(aaa)/1000)){
tmp_aaa=aaa[((j-1)*1000+1):(j*1000),]
tmp_bbb=bbb[unique(unlist(intersection[((j-1)*1000+1):(j*1000)])),]
List_inter=intersect(tmp_aaa,tmp_bbb)
gc()
gc()
gc()
setTxtProgressBar(pb, j)
}
Thank you !
gdalUtils
. – lokigdalUtils
is a very good and useful package but it won't help here. It's mostly to play with rasters. You use the raster package, but not on rasters so I doubt it would help. – BastienRSAGA
is my favorite follwed byRQGIS
and than more complicatedRGRASS7
. All need you to install the appropriate software (can be done one shot with OSGEO4W). They should succeed at your task. I'm kind of busy right now, if I have chance later on I'll post an example. – Bastien