6
votes

I'm optimizing R code with Rcpp and parallel computing on Windows. I'm having trouble calling an Rcpp function from parLapply. An example follows.

Rcpp code (test.cpp)

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector payoff( double strike, NumericVector data) {
    return pmax(data - strike, 0);
}

R code

library(parallel)
library(Rcpp)

sourceCpp("test.cpp")

strike_list <- as.list(seq(10, 100, by = 5))

data <- runif(10000) * 50

# One core version
strike_payoff <- lapply(strike_list, payoff, data)

# Multiple cores version
numWorkers <- detectCores()
cl <- makeCluster(numWorkers, type = "PSOCK")
clusterExport(cl = cl,varlist = "payoff")
strike_payoff <- parLapply(cl, strike_list, payoff, data)

Error from the parallel version

Error in checkForRemoteErrors(val) : 
  8 nodes produced errors; first error: NULL value passed as symbol address   

I know this is a Windows issue, since mclapply works well on Linux, but I don't have a Linux machine as powerful as my Windows one.

Any ideas how to fix it?

1
Please clarify: You run the code on a Windows machine and the server is also a Windows machine? - Roland
I only have a local Windows machine. No server. - kismsu

1 Answer

8
votes

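You need to run the sourceCpp() call in each spawned process, or otherwise get your compiled code to them. Right now the main process has the function; the spawned workers do not. clusterExport() only copies the R wrapper, not the compiled code it points to, which is why the workers report a NULL symbol address.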
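
A minimal sketch of the first option, assuming test.cpp from the question sits in the current working directory and that a C++ toolchain (Rtools on Windows) is available. The worker count of 2 and the names cpp_file and the anonymous wrapper function are illustrative choices, not part of the original code.

```r
library(parallel)

# Assumption: test.cpp from the question is in the current working directory.
# An absolute path avoids depending on each worker's working directory.
cpp_file <- normalizePath("test.cpp")

cl <- makeCluster(2, type = "PSOCK")  # 2 workers, chosen only for this sketch

# Compile and load the C++ code inside every worker process.
# sourceCpp() defines payoff() in each worker's global environment.
clusterCall(cl, function(f) Rcpp::sourceCpp(f), cpp_file)

strike_list <- as.list(seq(10, 100, by = 5))
data <- runif(10000) * 50

# The wrapper makes each worker look up its own payoff() by name,
# rather than receiving a function object serialized from the main process.
strike_payoff <- parLapply(cl, strike_list, function(s, d) payoff(s, d), data)

stopCluster(cl)
```

Each worker compiles the file for itself, so startup is slower than in the single-process version; that cost is paid once per cluster.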

The easiest way is to build a package and have each worker process load it.