I am running a simulation in R, and the code works fine without parallel computing. However, when I modify one line of my code to use parallel computing, R hangs, and each time it hangs at a different iteration. When R gets stuck, I have to stop it manually, and sometimes I get warnings such as:

Warning messages:
1: closing unused connection 13 (<-localhost:11688)
2: closing unused connection 12 (<-localhost:11688) 
3: closing unused connection 9 (<-localhost:11688) 
4: closing unused connection 8 (<-localhost:11688) 
5: closing unused connection 7 (<-localhost:11688) 
6: closing unused connection 6 (<-localhost:11688) 

Or something like

Warning message:
In .Internal(get(x, envir, mode, inherits)) :
closing unused connection 6 (<-localhost:11688)

Here is my code:

for (iter in 1:100){
    # Simulate data matrix X and Y, and initial start Z0

    for (i in 1:100){
        # Calculate input matrix Z based on Z0

        cl <- makeCluster(no_cores, type="FORK")
        Z <- cbind(Z, unlist(parLapply(cl,
                                       as.list(data.frame(t(Z))),
                                       function(x) prob(x,X,Y))))
        stopCluster(cl)

        result <- rbind(result,Z)
        result <- result[!duplicated(result),]
        result <- result[order(-result[,dim(result)[2]]),][1:10,]
        # Calculate a new Z0 based on Z
    }
}

where prob is a function that returns a vector of length equal to the number of rows of Z.

As the code works fine without using parallel computing, I believe the problem is in parallel computing. Instead of using parLapply, I also tried foreach within the iteration:

cl <- makeCluster(no_cores, type="FORK")
Z <- cbind(Z,foreach(tmp=as.list(data.frame(t(Z))),
                             .combine = c)  %dopar%
                     prob(tmp,X,Y))
stopCluster(cl)

After R gets stuck and I manually stop it from running, I get similar warnings:

Warning messages:
1: closing unused connection 13 (<-localhost:11688) 
2: closing unused connection 12 (<-localhost:11688) 
3: closing unused connection 9 (<-localhost:11688) 
4: closing unused connection 8 (<-localhost:11688) 
5: closing unused connection 7 (<-localhost:11688) 
6: closing unused connection 6 (<-localhost:11688) 
7: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
8: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
9: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation

I am using 3 cores for parallel computing (no_cores=3), and the machine is a 2016 MacBook Pro.

Can someone help me out? Thanks!

1 Answer

Help me understand more. You're looping with i but I don't see you using it? Am I missing something?

Also, I'd suggest creating your cluster outside of your loops. As written, you're creating and stopping a cluster 10,000 times (100 outer iterations times 100 inner iterations).

I call makeCluster outside of my loop, and then stop it after my loop finishes.

Try something like this:

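# prob must already be defined here so the forked workers inherit it
cl <- makeCluster(no_cores, type="FORK")

for (iter in 1:100){
    # Simulate data matrix X and Y, and initial start Z0

    # FORK workers only see what existed when the cluster was created,
    # so send the freshly simulated X and Y to them
    # (this assumes X and Y are global variables).
    clusterExport(cl, c("X", "Y"))

    for (i in 1:100){
        # Calculate input matrix Z based on Z0

        # The rows of Z are passed as the parLapply input, so Z needs no export
        Z <- cbind(Z, unlist(parLapply(cl,
                                       as.list(data.frame(t(Z))),
                                       function(x) prob(x,X,Y))))

        result <- rbind(result,Z)
        result <- result[!duplicated(result),]
        result <- result[order(-result[,dim(result)[2]]),][1:10,]
        # Calculate a new Z0
    }
}
stopCluster(cl)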
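
For something you can run end to end, here is a minimal sketch of the same idea. It is not your actual simulation: the body of prob, the toy matrices, and the run_simulation wrapper are all made up for illustration. It creates the cluster once, passes X and Y to prob as extra arguments so the workers never depend on stale copies, and registers stopCluster with on.exit so the cluster is shut down even if you interrupt the run. Leftover clusters that were never stopped are, as far as I can tell, where the "closing unused connection" warnings come from.

```r
library(parallel)

# Hypothetical stand-in for the asker's prob(): a single score for one row of Z.
prob <- function(z, X, Y) sum(z * colMeans(X)) + mean(Y)

# Hypothetical wrapper; the loop sizes are kept small so the sketch runs quickly.
run_simulation <- function(n_iter = 2, n_inner = 3, no_cores = 2) {
  # FORK only exists on Unix-alikes; fall back to PSOCK elsewhere.
  cl_type <- if (.Platform$OS.type == "unix") "FORK" else "PSOCK"
  cl <- makeCluster(no_cores, type = cl_type)
  # Runs on normal exit, on error, and on interrupt, so no connections leak.
  on.exit(stopCluster(cl), add = TRUE)

  result <- NULL
  for (iter in seq_len(n_iter)) {
    # Toy data; replace with the real simulation step.
    X <- matrix(rnorm(20), nrow = 5)
    Y <- rnorm(5)
    for (i in seq_len(n_inner)) {
      Z <- matrix(rnorm(12), nrow = 3)  # toy candidate rows, 4 columns like X
      rows <- lapply(seq_len(nrow(Z)), function(r) Z[r, ])
      # X and Y travel with the call as extra positional arguments to prob().
      scores <- unlist(parLapply(cl, rows, prob, X, Y))
      result <- rbind(result, cbind(Z, scores))
      result <- result[!duplicated(result), , drop = FALSE]
      # Keep at most the 10 best-scoring rows, highest score first.
      keep <- head(order(-result[, ncol(result)]), 10)
      result <- result[keep, , drop = FALSE]
    }
  }
  result
}

out <- run_simulation()
```

Passing X and Y positionally (rather than as X = X) matters here, because parLapply has its own argument named X. Sending them with every call costs some serialization time; if X and Y are large, exporting them once per outer iteration with clusterExport is probably cheaper.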

If this doesn't work, let me know. I'll help as I can!