2
votes

I tried to look for a duplicate question, and I know many people have asked about parLapply in R, so I apologize if I missed one that is applicable to my situation.

Problem: I have the following function that runs correctly in R, but when I try to run it in parallel using parLapply (I'm on a Windows machine) I get the error that the $ operator is invalid for atomic vectors. The error says that 3 nodes produced errors no matter how many nodes I set my cluster to; for example, I have 8 cores on my desktop, so I set the cluster to 7 nodes. Here is example code showing where the problem is:

library(parallel)
library(doParallel)
library(arrangements)

#Function

perms <- function(inputs)
{
  x <- 0
  L <- 2^length(inputs$w)           # total number of 0/1 permutations of length m
  ip <- inputs$ip                   # iterator created by arrangements::ipermutations()
  for (i in 1:L)
  {
    y <- ip$getnext() %*% inputs$w  # score the next permutation
    if (inputs$t >= y)
    {
      x <- x + 1                    # count permutations with score <= t
    }
  }
  return(x)
}

# inputs is a list of several other variables that are created before this
# function runs (w, t_obs and iperm); here is a reproducible example of them.
# W is derived from my data; this is just an easy way to make a reproducible example.


set.seed(1)
m <- 15
W <- matrix(runif(15, 0, 1))
iperm <- arrangements::ipermutations(0:1, m, replace = TRUE)
t_obs <- 5

inputs <- list(W, t_obs, iperm)
names(inputs) <- c("w", "t", "ip")

#If I run the function not in parallel
perms(inputs)

#It gives a value of 27322 for this example data

This runs exactly as it should; however, when I try the following to run it in parallel, I get an error:

# make the cluster
cor <- detectCores()
cl <- makeCluster(cor - 1, type = "SOCK")

# passing library and arguments
clusterExport(cl, c("inputs"))
clusterEvalQ(cl, {
  library(arrangements)
})

results <- parLapply(cl, inputs, perms)


I get the error:

Error in checkForRemoteErrors(val) : 
  3 nodes produced errors; first error: $ operator is invalid for atomic vectors

However, I've checked whether anything is an atomic vector using is.atomic(), and is.recursive(inputs) returns TRUE.

My question is: why am I getting this error when I run this with parLapply, when the function otherwise runs correctly, and is there a reason it says "3 nodes produced errors" even when I have 7 nodes?

3
Perhaps a typo, but you never define m used in ipermutations. – r2evans
@r2evans Yes, a typo; m is defined as 15 elsewhere in the code, and I've added that. I don't think I need to pass it in clusterExport, since only iperm depends on it and I pass iperm as part of inputs. – RAND

3 Answers

2
votes

It says "3 nodes" because, as you're passing it to parLapply, you are only activating three nodes. The first argument to parLapply should be a list of things, each element to pass to each node. In your case, your inputs is a list, correct, but it is being broken down, such that your three nodes are effectively seeing:

# node 1
perms(inputs[[1]]) # effectively inputs$w
# node 2
perms(inputs[[2]]) # effectively inputs$t
# node 3
perms(inputs[[3]]) # effectively inputs$ip
# nodes 4-7 idle

You could replicate this on the local host (not parallel) with:

lapply(inputs, perms)

and when you see it like that, perhaps it becomes a little more obvious what is being passed to your nodes. (If you want to see it further, do debug(perms), then run the lapply above, and see what inputs looks like inside that function call.)
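
As a quick local check (my own sketch, not part of the original answer, and assuming inputs and perms from the question are already defined in the session), you can wrap each call in try() to see what each element of inputs does when handed to perms() on its own; since inputs has only three elements, only three of the seven workers ever receive a task.

# Reproduce the per-element failures locally, without the cluster:
# each element of inputs is passed to perms() by itself, so e.g. the matrix W
# (an atomic vector underneath) triggers "$ operator is invalid for atomic vectors".
errs <- lapply(inputs, function(el) try(perms(el), silent = TRUE))
sapply(errs, inherits, "try-error")  # shows which elements fail when passed alone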

To get this to work once on one node (I think not what you're trying to do), you could do

parLapply(cl, list(inputs), perms)

But that's only going to run one instance on one node. Perhaps you would prefer to do something like:

parLapply(cl, replicate(7, inputs, simplify=FALSE), perms)

0
votes

I'm adding an answer in case anyone with a similar problem comes across this. @r2evans answered my original question, which led to the realization that even fixing the above problems would not get me the desired result (see the comments on his answer).

Problem: I am using the package arrangements to generate a large number of combinations and apply a function to them. This becomes very time consuming as the number of combinations gets huge. What we need to do is split the combinations into chunks depending on the number of cores that will run in parallel, and then have each node do the calculations only on its own chunk of the combinations.

Solution:


cor <- detectCores() - 1
cl <- makeCluster(cor, type = "SOCK")

set.seed(1)
m <- 15
W <- matrix(runif(15, 0, 1))
# iperm <- arrangements::ipermutations(0:1, m, replace = TRUE)  # now created on each worker instead
t_obs <- 5

# one chunk index per worker
chunk_list <- as.list(1:cor)

# size of each chunk: the first cor-1 chunks get floor(2^m / cor) permutations,
# and the last chunk gets whatever remains, so all 2^m permutations are covered
chunk_size <- floor((2^m) / cor)
chunk_size <- c(rep(chunk_size, cor - 1), (2^m) - chunk_size * (cor - 1))

# one input list per worker: the same t and w, plus that worker's chunk index and the chunk sizes
inputs_list <- Map(list, t = list(t_obs), w = list(W), chunk_list = chunk_list, chunk_size = list(chunk_size))

#inputs <- list(W,t_obs, iperm)
#names(inputs) <- c("w", "t", "ip", "chunk_it")




perms <- function(inputs)
{
  x <- 0
  # create the iterator on the worker itself (m and cor are exported below)
  ip <- arrangements::ipermutations(0:1, m, replace = TRUE)

  # recompute the chunk sizes on the worker
  chunk_size <- floor((2^m) / cor)
  chunk_size <- c(rep(chunk_size, cor - 1), (2^m) - chunk_size * (cor - 1))

  # skip ahead past the permutations handled by the earlier chunks
  # (note the parentheses: 1:(k - 1), not 1:k - 1)
  if (inputs$chunk_list != 1)
  {
    ip$getnext(sum(chunk_size[1:(inputs$chunk_list - 1)]))
  }

  # process only this worker's chunk
  for (i in 1:chunk_size[inputs$chunk_list])
  {
    y <- ip$getnext() %*% inputs$w
    if (inputs$t >= y)
    {
      x <- x + 1
    }
  }
  return(x)
}




clusterExport(cl, c("inputs_list", "m", "cor"))
clusterEvalQ(cl, {
  library(arrangements)
})

system.time(results <- parLapply(cl, inputs_list, perms))
Reduce(`+`, results)
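
As a follow-up (my own suggestion, not part of the original answer): because the chunks together cover all 2^m permutations, the summed parallel count should match the serial result from the question (27322 for the m = 15 example), and the cluster should be shut down once the results are collected.

# Sanity check (assumes the chunking above is correct): the chunked parallel
# count should equal the serial count from the question; then release the workers.
total <- Reduce(`+`, results)
total == 27322   # expected TRUE for the m = 15 example
stopCluster(cl)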

What I did was split the total number of combinations into chunks, i.e. the first 4681 go to the first node (I have 7 nodes assigned via cor), the next 4681 to the second node, and so on, making sure no combinations are missed. Then I changed my original function to generate the permutations on each node but skip ahead to the combination it should start calculating from, so node 1 starts at the first combination, node 2 starts at the 4682nd, and so on. I'm still working on optimizing this, because it's currently only about 4 times as fast as the non-parallel version even though I'm using 7 cores; I think using the skip option in the permutation function will speed this up, but I haven't checked yet. Hopefully this is helpful to someone else: it cuts my estimated time to run a simulation (with m = 25, not 15) from about 10 days to about 2.5 days.
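
To make the chunk arithmetic concrete (my own illustration using the numbers from this answer): with m = 15 and 7 workers there are 2^15 = 32768 permutations; the first six workers each take floor(32768 / 7) = 4681 of them and the last takes the remaining 4682, so worker 1 starts at permutation 1, worker 2 at 4682, worker 3 at 9363, and so on.

# Chunk boundaries for m = 15 and 7 workers (illustration only)
m   <- 15
cor <- 7
chunk_size  <- floor((2^m) / cor)
chunk_size  <- c(rep(chunk_size, cor - 1), (2^m) - chunk_size * (cor - 1))
chunk_start <- cumsum(c(1, head(chunk_size, -1)))  # first permutation each worker handles
rbind(start = chunk_start, size = chunk_size)
sum(chunk_size) == 2^m   # TRUE: every permutation is covered exactly once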

0
votes

You need to load dplyr on the worker nodes to solve this:

clusterEvalQ(clust, { library(dplyr) })

The above code should solve your issue.