I am currently using the parallel package in R and I am trying to make my work reproducible by setting seeds.
However, setting the seed before creating the cluster and running the tasks in parallel does not make the results reproducible. I think I need to set the seed on each worker when I create the cluster.
I have made a small example here to illustrate my problem:
library(parallel)
# function to generate 2 uniform random numbers
runif_parallel <- function() {
  # make cluster of two cores
  cl <- parallel::makeCluster(2)
  # sample uniform random numbers
  samples <- parallel::parLapplyLB(cl, X = 1:2, fun = function(i) runif(1))
  # close cluster
  parallel::stopCluster(cl)
  return(unlist(samples))
}
set.seed(41)
test1 <- runif_parallel()
set.seed(41)
test2 <- runif_parallel()
# they should be the same since they have the same seed
identical(test1, test2)
In this example, test1 and test2 should be the same, as they are generated with the same seed, but they contain different values.
Can I get some help with where I'm going wrong please?
Note that I've written this example the way I have to mimic how I'm using it right now - there are probably cleaner ways to generate two random uniform numbers in parallel.
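For reference, here is a minimal sketch of the per-worker seeding I have in mind. It assumes that parallel::clusterSetRNGStream() is the appropriate way to seed the workers. The wrapper name runif_parallel_seeded and its iseed argument are hypothetical and not part of my real code. It also swaps parLapplyLB for parLapply, on the assumption that load balancing could send a task to a different worker from one run to the next.

```r
library(parallel)

# Hypothetical variant of runif_parallel() that seeds the workers explicitly.
runif_parallel_seeded <- function(iseed) {
  # make cluster of two cores
  cl <- parallel::makeCluster(2)
  # stop the cluster when the function exits, even on error
  on.exit(parallel::stopCluster(cl))
  # give each worker its own L'Ecuyer-CMRG random number stream,
  # derived from iseed
  parallel::clusterSetRNGStream(cl, iseed = iseed)
  # parLapply (not parLapplyLB) assigns tasks to workers in a fixed order
  samples <- parallel::parLapply(cl, X = 1:2, fun = function(i) runif(1))
  unlist(samples)
}

test1 <- runif_parallel_seeded(41)
test2 <- runif_parallel_seeded(41)
# compare the two runs
identical(test1, test2)
```

In this sketch the seed is passed to the cluster directly rather than set with set.seed() in the main session, since the workers are separate R processes with their own RNG state.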