This question is about running a for loop on multiple cores. I am trying to learn how to run code in parallel. My actual code is fairly complicated, so I have recreated a simplified version here. Note that this example is for illustration only and is not the actual code.
library(parallel)
library(foreach)
library(doParallel)
#Creating a mock dataframe
Event_ID = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3)
Type=c("A","B","C","D","E","A","B","C","D","E","A","B","C","D")
Revenue1=c(24,9,51,7,22,15,86,66,0,57,44,93,34,37)
Revenue2=c(16,93,96,44,67,73,12,65,81,22,39,94,41,30)
z = data.frame(Event_ID,Type,Revenue1,Revenue2)
#replicates z 5000 times
n =5000
zz=do.call("rbind", replicate(n, z, simplify = FALSE))
zz$Revenue3 = 0
#################################################################
# **foreach, dopar failed attempt**
#################################################################
cl=parallel::makeCluster(14,type="PSOCK") # I have 8 cores / 16 threads but use 14 workers here; adjust this for your machine.
registerDoParallel(cl)
home1 = function(zz1){
  foreach(i=1:nrow(zz1), .combine = rbind) %dopar% {
    zz1[i,'Revenue3'] = sqrt(zz1[i,'Revenue1'])+(zz1[i,'Revenue2'])
  }
  return(zz1)
}
zzz = home1(zz1=zz)
stopCluster(cl)
#################################################################
#Non parallel implementation
#################################################################
home2 = function(zz2){
  zz3=zz2
  for (i in 1:nrow(zz3)){
    zz3[i,'Revenue3'] = sqrt(zz3[i,'Revenue1'])+(zz3[i,'Revenue2'])
  }
  return(zz3)
}
zzzz=home2(zz2=zz)
I create a data frame and try to fill in its Revenue3 column using foreach with %dopar% (home1), but it does not work: the data frame I get back is the same as the input data frame. For comparison, I have also included a non-parallel version (home2). I realize that I might be making a basic mistake, but I am not experienced enough to figure out what exactly is wrong. Any help would be appreciated.
P.S. I realize that my non-parallel version is not optimal and could be improved, but it is only meant as an example.
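A likely explanation for the behavior described above, offered as a sketch rather than a definitive diagnosis: with a PSOCK cluster each worker operates on its own copy of the data, so an assignment made inside the %dopar% body does not reach the data frame in the calling session. foreach instead returns the value of each iteration, combined with .combine, and that return value is what has to be captured and assigned. The small data frame df, the vector rev3, and the choice of 2 workers below are assumptions made up for illustration; they are not part of the original code.

```r
library(foreach)
library(doParallel)

# Small stand-in for the data frame in the question (hypothetical values).
df <- data.frame(Revenue1 = c(24, 9, 51, 7),
                 Revenue2 = c(16, 93, 96, 44))

# Two workers is an arbitrary choice for this sketch.
cl <- parallel::makeCluster(2, type = "PSOCK")
registerDoParallel(cl)

# Each iteration returns a single number; .combine = c collects the
# per-iteration values into one vector, in iteration order.
rev3 <- foreach(i = seq_len(nrow(df)), .combine = c) %dopar% {
  sqrt(df[i, "Revenue1"]) + df[i, "Revenue2"]
}

parallel::stopCluster(cl)

# The assignment happens in the calling session, not on the workers.
df$Revenue3 <- rev3

# For this particular formula, the vectorized expression gives the same
# values without any cluster.
stopifnot(isTRUE(all.equal(df$Revenue3, sqrt(df$Revenue1) + df$Revenue2)))
```

For a per-row computation this cheap, the cost of sending work to the workers may outweigh any gain, so the parallel form is mainly useful when each iteration does substantially more work.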