
I use the @parallel for macro to run simulations over a range of parameters. Each run produces a 1-dimensional vector. In the end I would like to collect the results in a DataFrame.

Up until now I had always created an intermediate array by reducing the for-loop with vcat, and then constructed the DataFrame from that. I thought it might also work to push! the result of each calculation to the master process via remotecall. A minimal example would look like:

X = Float64[]

@sync @parallel for i in linspace(1., 10., 10)
    # ask the master (process 1) to push this result into X
    remotecall_fetch(() -> push!(X, i), 1)
end

The result is consistently an array X with 9, not 10, elements, and the number of dropped elements grows as more workers are added.

This is on julia-0.6.1.

I thought I had understood Julia's parallel computing model, but apparently not.

What is the reason for this behavior? And how can I do it better and safely?


1 Answer


I suspect you're triggering a race condition, though I couldn't say exactly where.

If you only need to return one value per iteration, I would suggest just using pmap:

pmap(linspace(1., 10., 10)) do i
    # replace this with the real per-parameter computation;
    # pmap returns the results in input order
    i
end
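
Since pmap returns the results in input order, you can then assemble the DataFrame entirely on the master. A minimal sketch, assuming a hypothetical simulate function standing in for your computation, and a DataFrames version that accepts a matrix in its constructor:

using DataFrames

# hypothetical stand-in for the real simulation; returns a 1-d vector
simulate(p) = [p, p^2, p^3]

# a Vector of result vectors, one per parameter, in input order
results = pmap(simulate, linspace(1., 10., 10))

# stack the runs column-wise; the DataFrame gets auto-named columns x1, x2, ...
df = DataFrame(hcat(results...))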

Otherwise, if each iteration can return multiple values, it would probably be best to use a RemoteChannel.
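
For example, something along these lines (a sketch: the workers put! their results into a channel owned by the master, which drains it afterwards; simulate is the same hypothetical function as above):

params = linspace(1., 10., 10)

# channel living on the master (process 1), buffered so all puts can complete
results = RemoteChannel(() -> Channel{Tuple{Float64,Vector{Float64}}}(32))

@sync @parallel for i in params
    # each worker sends a (parameter, result) pair to the master's channel
    put!(results, (i, simulate(i)))
end

# drain the channel on the master; arrival order is not guaranteed,
# which is why the parameter is sent along with each result
X = [take!(results) for _ in 1:length(params)]

The buffer here is sized larger than the number of iterations, so every put! can complete before the master starts taking; with a smaller buffer you would have to take! concurrently with the loop.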