I'm experiencing slowness when creating clusters using the parallel package.
Here is a function that just creates and then stops a PSOCK cluster, with n nodes.
library(parallel)
library(microbenchmark)
f <- function(n)
{
cl <- makeCluster(n)
on.exit(stopCluster(cl))
}
microbenchmark(f(2), f(4), times = 10)
## Unit: seconds
## expr min lq median uq max neval
## f(2) 4.095315 4.103224 4.206586 5.080307 5.991463 10
## f(4) 8.150088 8.179489 8.391088 8.822470 9.226745 10
My machine (a reasonably modern 4-core workstation running Win 7 Pro) is taking about 4 seconds to create a two node cluster and 8 seconds to create a four node cluster. This struck me as too slow, so I tried the same profiling on a colleague's identically specced machine, and it took one/two seconds for the two tests respectively.
This suggested I may have some odd configuration set up on my machine, or that there is some other problem. I read the ?makeCluster and socketConnection help pages, but did not see anything related to improving performance.
I had a look in the Windows Task Manager while the code was running: there was no obvious interference with anti-virus or other software, just an Rscript process running at ~17% (less than one core).
I don't know where to look to find the source of the problem. Are there any known causes of slowness with PSOCK cluster creation under Windows?
Is 8 seconds to create a 4-node cluster actually slow (by 2014 standards), or are my expectations too high?
