Speed up a Monte Carlo simulation with nested loops in R

Question

I would like to speed up the following Monte Carlo simulation of a DEA estimate:

A<-nrow(banks)
effm<-matrix(nrow=A, ncol=2)
m<-20
B<-100

pb <- txtProgressBar(min = 0,
                     max = A, style=3)
for(a in 1:A) {
  x1<-x[-a,]
  y1<-y[-a,]
  theta=matrix(nrow=B,ncol=1) 

  for(i in 1:B){

    xrefm<-x1[sample(1:nrow(x1),m,replace=TRUE),]
    yrefm<-y1[sample(1:nrow(y1),m,replace=TRUE),]
    theta[i,]<-dea(matrix(x[a,],ncol=3),
                   matrix(y[a,],ncol=3),
                   RTS='vrs',ORIENTATION='graph',
                   xrefm,yrefm,FAST=TRUE)
  }

  effm[a,1]=mean(theta)
  effm[a,2]=apply(theta,2,sd)/sqrt(B)
  setTxtProgressBar(pb, a) 
}
close(pb)
effm

Once A becomes large, the simulation freezes. I am aware from online research that the apply function can speed up such code, but I am not sure how to use it in the above procedure.

Any help or direction would be much appreciated.

Barry

There's a lot of misinformation online. The apply function may or may not be faster than a for loop; it depends on what you're doing. You need to profile your code for speed to see what portions are slowest (see ?Rprof), then you will know what needs to be faster. People could help profile your code if you provide a reproducible example. — Joshua Ulrich
@JoshuaUlrich ditto! also, if you can post portions of the data you're using, we will be able to actually run your code which makes it much easier to help — Justin
Can you define "freeze" ? There's a big difference between a process that takes a long time, and one which blows out system memory (or something) and hangs up the process and/or the entire OS. — Carl Witthoft
Would be helpful if we could run this code locally. What is banks? — Roman Luštrik
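
The profiling suggestion in the comments can be tried with a sketch like the one below. The question's data and `dea` are not available here, so `slow_step` is a hypothetical stand-in for one inner-loop iteration. Only `Rprof` and `summaryRprof` are real functions (from base R's utils package); everything else is made up for illustration.

```r
# Hypothetical stand-in for one expensive inner-loop call (not the real dea()).
slow_step <- function(n = 300) {
  mat <- matrix(rnorm(n * n), n, n)
  sum(solve(mat))          # some genuinely costly linear algebra
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)           # start the sampling profiler
theta <- numeric(100)
for (i in seq_along(theta)) {
  theta[i] <- slow_step()
}
Rprof(NULL)                # stop profiling

# Time spent inside each function itself, most expensive first.
prof <- summaryRprof(prof_file)
head(prof$by.self)
```

If the top entries belong to the function called in the inner loop rather than to sampling or subsetting, rearranging the loops is unlikely to change the run time much.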

John John · Accepted Answer · 2012-11-29T15:55:40

The following should be faster. However, if R is locking up when A is large, that might be a memory issue, and the following is more memory intensive. More information would be helpful: what banks, x, and y are, where dea comes from, and what the purpose is.
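
How much extra memory the batched sampling needs can be estimated roughly. The sketch below assumes `x` and `y` are numeric (double) matrices with 3 columns each. The column count is taken from the `ncol=3` calls in the question; the storage type is an assumption.

```r
m <- 20
B <- 100

# One batch of resampled reference rows: m * B rows, 3 numeric columns.
batch <- matrix(0, nrow = m * B, ncol = 3)

# 8 bytes per double, plus a small fixed header for the matrix object.
bytes_one <- as.numeric(object.size(batch))
bytes_one                      # a little over 48,000 bytes

# xrefm and yrefm together, per outer-loop iteration, in KiB.
2 * bytes_one / 1024
```

Under those assumptions the extra allocation is on the order of 100 KB per outer iteration, so the batching by itself seems unlikely to exhaust memory. With much larger m, B, or wider matrices the picture could differ.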

Essentially, all I've done is move as much as I can out of the inner loop. The shorter that loop is, the better off you'll be.

A <- nrow(banks)
effm <- matrix(nrow = A, ncol = 2)
m <- 20
B <- 100
pb <- txtProgressBar(min = 0,
                     max = A, style=3)
for(a in 1:A) {
  x1 <- x[-a,] # reference set excludes the bank being evaluated
  y1 <- y[-a,]
  theta <- numeric(B)
  xrefm <- x1[sample(1:nrow(x1), m * B, replace=TRUE),] # get all of your samples at once
  yrefm <- y1[sample(1:nrow(y1), m * B, replace=TRUE),]
  deaX <- matrix(x[a,], ncol=3) # built once per bank instead of once per replicate
  deaY <- matrix(y[a,], ncol=3)

  for(i in 1:B){
    rows <- (1:m) + (i-1) * m # rows of the i-th block of m samples
    theta[i] <- dea(deaX, deaY, RTS = 'vrs', ORIENTATION = 'graph',
                    xrefm[rows,], yrefm[rows,], FAST=TRUE)
  }

  effm[a,1] <- mean(theta)
  effm[a,2] <- sd(theta) / sqrt(B)
  setTxtProgressBar(pb, a) 
}
close(pb)
effm
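
The same restructuring can be tried without the original data. The sketch below is self-contained: `x` and `y` are simulated, and `toy_score` is a hypothetical stand-in for `dea`, so the numbers mean nothing and only the loop structure carries over. It also draws one set of row indices per replicate and uses it for both `x1` and `y1`. That assumes the inputs and outputs of a reference unit should stay paired, whereas the code above samples them independently.

```r
set.seed(1)
A <- 30; m <- 20; B <- 100
x <- matrix(runif(A * 3), ncol = 3)   # simulated inputs (assumption)
y <- matrix(runif(A * 3), ncol = 3)   # simulated outputs (assumption)

# Hypothetical stand-in for dea(): any function returning one number.
toy_score <- function(x0, y0, xref, yref) {
  mean(y0) / mean(x0) * mean(xref) / mean(yref)
}

effm <- matrix(NA_real_, nrow = A, ncol = 2,
               dimnames = list(NULL, c("mean", "se")))

for (a in seq_len(A)) {
  x1 <- x[-a, , drop = FALSE]
  y1 <- y[-a, , drop = FALSE]
  x0 <- x[a, , drop = FALSE]
  y0 <- y[a, , drop = FALSE]

  # All m * B row indices at once; column i holds the rows for replicate i.
  idx <- matrix(sample.int(nrow(x1), m * B, replace = TRUE), nrow = m)

  # vapply is a loop in disguise: tidier, not necessarily faster than for().
  theta <- vapply(seq_len(B), function(i) {
    rows <- idx[, i]
    toy_score(x0, y0, x1[rows, , drop = FALSE], y1[rows, , drop = FALSE])
  }, numeric(1))

  effm[a, ] <- c(mean(theta), sd(theta) / sqrt(B))
}
head(effm)
```

Swapping `toy_score` for the real `dea` call would be the only change needed in the inner function. If most of the time is spent inside `dea` itself, neither this version nor the `for` loop above will speed things up by much.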
