I am trying to process more than 10,000 xts objects saved on disk, each around 0.2 GB when loaded into R. I would like to use foreach to process them in parallel. My code works for something like 100 xts objects that I pre-load into memory, export, etc., but beyond roughly 100 objects I hit memory limits on my machine.
Example of what I am trying to do:
require(xts)
require(TTR)
require(doMPI)
require(foreach)
test.data <- runif(n=250*10*60*24)
xts.1 <- xts(test.data, order.by=as.Date(1:length(test.data), origin="1970-01-01"))
xts.1 <- cbind(xts.1, xts.1, xts.1, xts.1, xts.1, xts.1)
colnames(xts.1) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
print(object.size(xts.1), units="Gb")
xts.2 <- xts.1
xts.3 <- xts.1
xts.4 <- xts.1
save(xts.1, file="xts.1.rda")
save(xts.2, file="xts.2.rda")
save(xts.3, file="xts.3.rda")
save(xts.4, file="xts.4.rda")
names <- c("xts.1", "xts.2", "xts.3", "xts.4")
rm(xts.1, xts.2, xts.3, xts.4)
cl <- startMPIcluster(count=2) # Use 2 cores
registerDoMPI(cl)
result <- foreach(name=names,
                  .combine=cbind,
                  .multicombine=TRUE,
                  .inorder=FALSE,
                  .packages=c("TTR", "xts")) %dopar% {
  # TODO: Move the following line out of the worker. One (or 5, 10,
  # 20, ... but not all) object at a time should be loaded
  # by the master and exported to the worker "just in time"
  load(file=paste0(name, ".rda"))
  last(SMA(get(name)[, 1], 10))
}
closeCluster(cl)
print(result)
So I am wondering how I could load each xts object (or several at a time, e.g. 5, 10, 20, 100, but not all at once) from disk "just in time", right before it is needed by/exported to a worker. I cannot load the objects in the workers (based on the name and the folder where they are stored on disk), because the workers may run on remote machines that have no access to that folder. So I need to read/load the objects "just in time" in the main process...
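To make this concrete, here is a minimal sketch of the kind of "lazy" iterator I have in mind, following the custom-iterator pattern of the iterators package (ixts is a hypothetical helper I have not tested at scale; it assumes the <name>.rda files created above):

require(iterators)

ixts <- function(names) {
  i <- 0
  nextEl <- function() {
    i <<- i + 1
    if (i > length(names)) stop("StopIteration")
    # Load one object from disk in the master only when the next task
    # is about to be handed out, and return the object itself
    e <- new.env()
    load(file=paste0(names[i], ".rda"), envir=e)
    get(names[i], envir=e)
  }
  it <- list(nextElem=nextEl)
  class(it) <- c("abstractiter", "iter")
  it
}

result <- foreach(x=ixts(names),
                  .combine=cbind,
                  .multicombine=TRUE,
                  .inorder=FALSE,
                  .packages=c("TTR", "xts")) %dopar% {
  last(SMA(x[, 1], 10))
}

With this the workers receive the data itself rather than a file name, so they do not need access to the folder on disk. Whether the master really keeps only a handful of objects in memory at any moment presumably depends on how the back-end prefetches tasks (e.g. doMPI's chunkSize option), which is part of what I am unsure about; loading blocks of 5, 10, 20, ... objects per task should work the same way by having the iterator return a list of objects per call.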
I am using doMPI and doRedis as parallel back-ends. doMPI seems more memory-efficient but slower than doRedis (tested on 100 objects).
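For reference, the doRedis side of that comparison is set up roughly like this (only a sketch; the queue name "jobs" is arbitrary and a Redis server is assumed to be running on localhost):

require(doRedis)
registerDoRedis("jobs")                # register the Redis-backed foreach back-end
startLocalWorkers(n=2, queue="jobs")   # workers could equally be started on remote machines
# ... run the same foreach() loop as above ...
removeQueue("jobs")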
So I would like to understand what a proper "strategy"/"pattern" for approaching this problem would be.