I'm experiencing slowness when creating clusters using the parallel package.

Here is a function that just creates and then stops a PSOCK cluster, with n nodes.

library(parallel)
library(microbenchmark)
f <- function(n)
{
  cl <- makeCluster(n)
  on.exit(stopCluster(cl))
}
microbenchmark(f(2), f(4), times = 10)
## Unit: seconds
##  expr      min       lq   median       uq      max neval
##  f(2) 4.095315 4.103224 4.206586 5.080307 5.991463    10
##  f(4) 8.150088 8.179489 8.391088 8.822470 9.226745    10   

My machine (a reasonably modern 4-core workstation running Win 7 Pro) takes about 4 seconds to create a two-node cluster and 8 seconds to create a four-node cluster. This struck me as too slow, so I ran the same benchmark on a colleague's identically specced machine, where the two tests took about one and two seconds, respectively.

This suggested I may have some odd configuration set up on my machine, or that there is some other problem. I read the ?makeCluster and socketConnection help pages, but did not see anything related to improving performance.

I had a look in the Windows Task Manager while the code was running: there was no obvious interference with anti-virus or other software, just an Rscript process running at ~17% (less than one core).

I don't know where to look to find the source of the problem. Are there any known causes of slowness with PSOCK cluster creation under Windows?

Is 8 seconds to create a 4-node cluster actually slow (by 2014 standards), or are my expectations too high?

8 seconds is pretty slow. It takes about 1 second on my 3 years old win7 workstation. - Roland
Takes 11 seconds on my new win8 pc. - Jonas Tundo
1.6 and 3.2 seconds on my 3 year old i7 with W7. - Roman Luštrik
@JT85 All the more reason for me to avoid buying a Win8 PC :-) - Tyler Rinker
0.634 and 1.271, and 0.638 and 1.272 using the snowfall package, `f2(n){sfInit(parallel=T,cpus=n); sfStop()}`, on Ubuntu 13.04 - James Tobin

1 Answer

To monitor what was happening, I installed and opened Process Monitor (HT @qethanm). I also exited most of the things in my system tray like Dropbox, in order to generate less noise. (Though in the end, this didn't make a difference.)

I then re-ran a simplified version of the R code in the question, directly from R GUI (instead of an IDE).

microbenchmark(f(4), times = 5)

After some digging, I noticed that R GUI spawns a separate Rscript process for each node of each cluster that it creates.

[Figure: the process tree shows an Rscript instance for each node in each cluster]

After many dead ends and wild goose chases, it occurred to me that perhaps these Rscript instances weren't vanilla R. I renamed my Rprofile.site file to hide it and repeated the benchmark.

This time, a 4-node cluster was created in just under a second, on average.
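
If you would rather not rename anything, a rough way to check whether startup files are the culprit is to time a bare Rscript launch with and without them. This is only a sketch: the helper name time_rscript is made up for illustration, and the file locations follow the defaults described in ?Startup (a .Rprofile in the working directory would take precedence over the one in the home directory and is not checked here).

```r
# Startup files that each Rscript worker would normally read (see ?Startup).
# R_PROFILE and R_PROFILE_USER override the default locations if they are set.
startup_files <- c(
  site = Sys.getenv("R_PROFILE", file.path(R.home("etc"), "Rprofile.site")),
  user = Sys.getenv("R_PROFILE_USER", path.expand("~/.Rprofile"))
)
print(sapply(startup_files, file.exists))

# Path to the Rscript executable belonging to the running R installation.
rscript <- file.path(
  R.home("bin"),
  if (.Platform$OS.type == "windows") "Rscript.exe" else "Rscript"
)

# Hypothetical helper: elapsed seconds for one trivial Rscript run.
time_rscript <- function(args = character()) {
  system.time(
    system2(rscript, c(args, "-e", "1"), stdout = FALSE, stderr = FALSE)
  )[["elapsed"]]
}

with_files    <- time_rscript()
without_files <- time_rscript(c("--no-init-file", "--no-site-file", "--no-environ"))
print(c(with_files = with_files, without_files = without_files))
```

A large gap between the two timings would point at the startup files; a single run is noisy, so repeat it a few times before drawing conclusions.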

For a four-node cluster, the Rprofile.site file (and presumably the personal startup file, ~/.Rprofile, if it exists) is read four times, once per worker, which can slow things down considerably. To avoid this, pass rscript_args = c("--no-init-file", "--no-site-file", "--no-environ") to makeCluster.
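
As a minimal sketch, here is the function from the question with those arguments applied. The names vanilla_args and f_vanilla are just illustrative. Note that the workers will no longer see anything your startup files set up (options, library paths, environment variables), so this assumes the cluster code does not depend on them.

```r
library(parallel)

# Rscript flags that skip the user profile, the site profile and the Renviron files
vanilla_args <- c("--no-init-file", "--no-site-file", "--no-environ")

# Same as f() in the question, but the workers start without reading startup files
f_vanilla <- function(n)
{
  cl <- makeCluster(n, rscript_args = vanilla_args)
  on.exit(stopCluster(cl))
  invisible(NULL)
}

# Time a single creation of a two-node cluster
print(system.time(f_vanilla(2)))
```

Comparing this against the original f(2) with microbenchmark should show whether the startup files account for the difference on your machine.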