I am relatively new to Julia and I am having some issues when trying to parallelise. I have tried both the pmap and @parallel approaches and run into the same issue with both.
When I run something like:
```julia
addprocs(7)

A0 = zeros(a_size, b_size, c_size)            # a_size, b_size, c_size and fs are defined earlier
A  = SharedArray{Float64}(a_size, b_size, c_size)
toler = 1e-3
maxit = 1000
metric1 = Inf                                  # convergence metric
iter1 = 0                                      # iteration counter

while (metric1 > toler) && (iter1 < maxit)
    @inbounds @sync @parallel for i in 1:c_size
        A[:, :, i] = compute_A(fs, A0[:, :, i], i)
    end
    A_new = sdata(A)
    metric1 = maximum(abs.(A_new - A0))
    A0 = copy(A_new)
    iter1 = iter1 + 1
    println("$(iter1) $(metric1)")
end
```
where the inputs of the function compute_A are:
- fs is a DataType defined by me
- A0 is an array
- i is the index I'm looping over (dimension c_size)
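To make the shapes explicit, a stripped-down stand-in for compute_A would look something like this (my real function does actual work; this is just a placeholder with the same signature, defined with @everywhere so the workers can see it):

```julia
# Hypothetical stand-in for compute_A: takes the DataType, one
# a_size-by-b_size slice of A0, and the slice index, and returns
# a matrix of the same size (placeholder computation only).
@everywhere function compute_A(fs::DataType, A0_slice::Array{Float64,2}, i::Int)
    return A0_slice .+ i
end
```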
This seems to work fine (it also works if I use pmap instead of the shared array and the @parallel loop).
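For reference, the pmap variant I tried has roughly this form (written from memory; it computes each slice on a worker and re-assembles the result into a 3D array):

```julia
# pmap version of the inner loop: each worker returns one a_size-by-b_size
# slice, and the slices are stacked along the third dimension (Julia 0.6 syntax).
slices = pmap(i -> compute_A(fs, A0[:, :, i], i), 1:c_size)
A_new  = cat(3, slices...)
```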
However, when I wrap this code in a function, like:
```julia
function wrap(fs::DataType, toler::Float64, maxit::Int)
    A0 = zeros(a_size, b_size, c_size)         # a_size, b_size, c_size still globals
    A  = SharedArray{Float64}(a_size, b_size, c_size)
    metric1 = Inf
    iter1 = 0
    while (metric1 > toler) && (iter1 < maxit)
        @inbounds @sync @parallel for i in 1:c_size
            A[:, :, i] = compute_A(fs, A0[:, :, i], i)
        end
        A_new = sdata(A)
        metric1 = maximum(abs.(A_new - A0))
        A0 = copy(A_new)
        iter1 = iter1 + 1
        println("$(iter1) $(metric1)")
    end
end
```
calling wrap(fs, 1e-3, 1000) runs far slower than the top-level version (roughly 600 seconds instead of 6).
This seems extremely weird; I don't understand what I'm doing wrong, but there is clearly something I'm missing, so I was hoping I could get some help here.
I am using Julia v0.6.0.
Thanks a lot for your time and help.