
Suppose that I want to do something in R that would normally (in one process/thread) look like this:

for(i in 1:2) {
    for(j in 1:2) {
        #Do some stuff here
    }
}

Using R's new package parallel, on a quad core machine, can I do the following?

cluster<-makeCluster(4)

innerLoop<-function() {
   #Do some stuff here
}

outerLoop<-function() { 
   result<-do.call(, parLapply(cluster, c(1:2), innerLoop))
}

final.result<-do.call(, parLapply(cluster, c(1:2), outerLoop))

Is this possible with the parallel package that comes with R-2.14.0?

1 Answer

Yes, you can do that. For the first level of parallelization, use distributed-memory technology (such as makeCluster(), which the parallel package takes from snow); for the second level, use shared-memory technology (mclapply(), which parallel takes from multicore).

Here is a simple code example:

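library(parallel)

# First level: a cluster of two worker processes (distributed memory).
cl <- makeCluster(2)

# Runs in each forked child: report the host name and process ID.
inner <- function(x) {
    pid <- Sys.getpid()
    name <- Sys.info()["nodename"]
    paste("This is R running on", name, "with PID", pid, "!")
}

# Runs on each cluster worker: fork `cores` children (shared memory).
# mclapply() relies on forking, so mc.cores > 1 does not work on Windows.
outer <- function(x, cores, funceval) {
    library(parallel)
    mclapply(seq_len(cores), funceval, mc.cores = cores)
}

parLapply(cl, seq_along(cl), outer, 2, inner)

# Shut the workers down when finished.
stopCluster(cl)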

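In the output you should see a different PID for every process. The machine names will differ only if the cluster workers run on separate hosts; with makeCluster(2) on a single machine they are all the same.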
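
To connect this back to the nested loop in the question, here is a minimal sketch of the same two-level pattern applied to i and j. The names do_stuff, inner_loop, and inner_cores are hypothetical placeholders (do_stuff stands in for "#Do some stuff here"), and combining with rbind is an assumption, since the question does not say which function it passes to do.call.

```r
library(parallel)

# Hypothetical stand-in for "#Do some stuff here": any function of (i, j).
do_stuff <- function(i, j) i * 10 + j

# Inner loop over j: forked children (shared memory) on the worker handling i.
inner_loop <- function(i, js, work, cores) {
    unlist(parallel::mclapply(js, function(j) work(i, j), mc.cores = cores))
}

# Forking is unavailable on Windows, so fall back to one core there.
inner_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Outer loop over i: one task per cluster worker (distributed memory).
cl <- makeCluster(2)
rows <- parLapply(cl, 1:2, inner_loop,
                  js = 1:2, work = do_stuff, cores = inner_cores)
stopCluster(cl)

# Combine the per-i results into a matrix (rows = i, columns = j).
final.result <- do.call(rbind, rows)
print(final.result)
```

The extra arguments are named js, work, and cores on purpose: a name such as f or fun would be matched to parLapply()'s own fun argument instead of being passed through to inner_loop. On a quad-core machine, two cluster workers that each fork two children use all four cores.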