I have some performance problems with parallel computing in Julia. I am new in both, Julia and parallel calculations.
In order to learn, I parallelized a code that should benefits from parallelization, but it does not.
The program estimates the mean of the mean of the components of arrays whose elements were chosen randomly with an uniform distribution.
Serial version
tic()
function mean_estimate(N::Int)
iter = 100000*2
p = 5000
vec_mean = zeros(iter)
for i = 1:iter
vec_mean[i] = mean( rand(p) )
end
return mean(vec_mean)
end
a = mean_estimate(0)
toc()
println("The mean is: ", a)
Parallelized version
addprocs(CPU_CORES - 1)
println("CPU cores ", CPU_CORES)
tic()
@everywhere function mean_estimate(N::Int)
iter = 100000
p = 5000
vec_mean = zeros(iter)
for i = 1:iter
vec_mean[i] = mean( rand(p) )
end
return mean(vec_mean)
end
the_mean = mean(vcat(pmap(mean_estimate,[1,2])...))
toc()
println("The mean is: ", the_mean)
Notes:
- The factor 2 in the fourth line of the serial code is because I tried the code in a PC with two cores.
- I checked the usage of the two cores with htop, and it seems to be ok.
The outputs I get are:
me@pentium-ws:~/average$ time julia serial.jl
elapsed time: 2.68671022 seconds
The mean is: 0.49999736055814215
real 0m2.961s
user 0m2.928s
sys 0m0.116s
and
me@pentium-ws:~/average$ time julia -p 2 parallel.jl
CPU cores 2
elapsed time: 2.890163089 seconds
The mean is: 0.5000104221069994
real 0m7.576s
user 0m11.744s
sys 0m0.308s
I've noticed that the serial version is slightly faster than the parallelized one for the timed part of the code. Also, that there is large difference in the total execution time.
Questions
- Why is the parallelized version slower? (what I am doing wrong?)
- Which is the right way to parallelize this program?
Note: I use pmap with vcat because I wish to try with the median too.
Thanks for your help
EDIT
I measured times as @HighPerformanceMark suggested. The tic()/toc() times are the following. The iteration number is 2E6 for every case.
Array Size Single thread Parallel Ratio
5000 2.69 2.89 1.07
100 000 488.77 346.00 0.71
1000 000 4776.58 4438.09 0.93
I am puzzled about why there is not clear trend with array size.
pmap
is not strictly typed. You should have a type conversion or assertion. – Chris Rackauckas