Concurrence of Julia Parallel Computing

Question

I am new to Julia and have recently been studying its parallel computing features. After reading the relevant documentation, I am still unclear about exactly how Julia's parallelism works, including the macros @sync and @async.

The following is the pmap function from the Julia v0.5 documentation:

function pmap(f, lst)
    np = nprocs()  # determine the number of processes available
    n = length(lst)
    results = Vector{Any}(n)
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    nextidx() = (idx=i; i+=1; idx)
    @sync begin
        for p=1:np
            if p != myid() || np == 1
                @async begin
                    while true
                        idx = nextidx()
                        if idx > n
                            break
                        end
                        results[idx] = remotecall_fetch(f, p, lst[idx])
                    end
                end
            end
        end
    end
    results
end

Is it possible for two different processes/workers to call nextidx() at the same time and both get the same idx = j? If so, I think results[j] would be computed twice and results[j+1] would never be computed.

Thanks very much.

More findings:

function f1()
    i = 1
    nextidx() = (idx = i; sleep(1); i += 1; idx)
    for p = 1:2
        @async begin
            idx = nextidx()
            println(idx)
        end
    end
end
f1()

The result is 1 1. This shows that the periods during which the two tasks execute nextidx() can overlap. So in the first code, if np = 3 (i.e., two workers) and the length n of lst is very large, say 10^8, I suspect two tasks could get the same index purely by a coincidence of timing, i.e., both evaluate idx = i at almost the same moment, which would make the code unreliable. Am I right?

Consider reading the excellent answer to this SO question regarding @sync and @async. In the pmap example that you cite, each process obtains idx from nextidx() separately from other processes. — Kevin L Keys
I read the answer and now understand the basic features of @sync and @async. However, I am still concerned that different tasks might get the same index idx. See my additional findings above. Thanks a lot. — user7261265

Kevin L Keys · Accepted Answer · 2016-12-08T23:58:20

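No, so long as you schedule jobs correctly. The pmap documentation points out that the master process schedules the jobs serially; only the parallel execution is asynchronous. Every @async block is a task running on the master process, and a task gives up control only at a yield point such as remotecall_fetch or sleep. nextidx as written contains no yield point, so each call reads and increments i without interruption, and pmap as coded requires no thread locks to ensure correct job scheduling. Adding sleep to nextidx deliberately breaks this feature: it lets another task run between the read and the increment, which introduces the race condition that you observe. Because all of the tasks share the counter i through nextidx (which is the purpose of that function), the master process will not schedule the same job twice.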
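
To make the difference concrete, here is a minimal sketch that runs the same counter with and without a yield point inside nextidx. The names draw_indices and yield_inside are hypothetical and used only for illustration, and the code assumes Julia 1.x syntax rather than v0.5.

```julia
# Tasks created with @async run on the calling process and switch only at
# yield points (sleep, I/O, remotecall_fetch, ...).

# Hypothetical helper: start `ntasks` tasks that each draw one index.
function draw_indices(ntasks; yield_inside::Bool)
    i = 1
    function nextidx()
        idx = i
        yield_inside && sleep(0.1)  # yield point between the read and the update
        i += 1
        return idx
    end
    drawn = Int[]
    @sync for _ in 1:ntasks
        @async push!(drawn, nextidx())
    end
    return sort(drawn)
end

println(draw_indices(3; yield_inside=false))  # [1, 2, 3]: each call completes before the next task runs
println(draw_indices(3; yield_inside=true))   # [1, 1, 1]: every task reads i before any task updates it
```

If a work-item generator really does need to yield, reading and incrementing i before the yield point keeps the indices distinct.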
