I just started with Julia and translated my MATLAB code into Julia (basically line-by-line). I noticed that the Julia code is much slower (like 50x). The original problem is a dynamic programming problem in which I interpolate the value function -- that interpolation is where the code spents most of the time. So I tried to make a minimal example code showing the performance differences. Important things to note is that it's a spline approximation for the interpolation and that the grid is preferrably irregular, i.e. not equally spaced. The MATLAB code:
tic
spacing=1.5;
Nxx = 300;
Naa = 350;
Nalal = 200;
sigma = 10;
NoIter = 500;
xx=NaN(Nxx,1);
xmin = 0.01;
xmax = 400;
xx(1) = xmin;
for i=2:Nxx
xx(i) = xx(i-1) + (xmax-xx(i-1))/((Nxx-i+1)^spacing);
end
f_U = @(c) c.^(1-sigma)/(1-sigma);
W=NaN(Nxx,1);
W(:,1) = f_U(xx);
xprime = ones(Nalal,Naa);
for i=1:NoIter
W_temp = interp1(xx,W(:,1),xprime,'spline');
end
toc
Elapsed time is 0.242288 seconds.
The Julia code:
using Dierckx
function performance()
const spacing=1.5
const Nxx = 300
const Naa = 350
const Nalal = 200
const sigma = 10
const NoIter = 500
xx=Array(Float64,Nxx)
xmin = 0.01
xmax = 400
xx[1] = xmin
for i=2:Nxx
xx[i] = xx[i-1] + (xmax-xx[i-1])/((Nxx-i+1)^spacing)
end
f_U(c) = c.^(1-sigma)/(1-sigma)
W=Array(Float64,Nxx,1)
W[:,1] = f_U(xx)
W_temp = Array(Float64,Nalal,Naa)
interp_spline = Spline1D(xx,W[:,1])
xprime = ones(Nalal,Naa)
for i=1:NoIter
for j=1:Naa
for k=1:Nalal
W_temp[k,j] = evaluate(interp_spline, xprime[k,j])
end
end
end
end
@time(performance())
4.200732 seconds (70.02 M allocations: 5.217 GB, 4.35% gc time)
I made W a 2-d array because in the original problem it is a matrix. I did some research on different interpolation packages but there weren't so much options for irregular grid and splines. MATLAB's interp1 is apparently not available.
My problem is that I was thinking if I write Julia code and it gives the same result as MATLAB, then Julia should be genereally faster. But apparently that is not the case, so that you need to pay SOME attention to your coding. I am not a programmer, of course I try to do my best, but I would like to know whether I am doing some obvious mistake here that can easily be fixed or whether it will happen (too) often that I have to pay attention to my Julia coding -- because then it might not worth it for me to learn it. In the same vein, if I can make Julia faster here (which I am pretty sure I can, e.g. the allocation looks a bit large), I could probably also make MATLAB faster. My hope with Julia was that -- for similar code -- it will run faster than MATLAB.
After some comments about the timing, I also ran this code:
using Dierckx
tic()
const spacing=1.5
const Nxx = 300
const Naa = 350
const Nalal = 200
const sigma = 10
const NoIter = 500
xx=Array(Float64,Nxx)
xmin = 0.01
xmax = 400
xx[1] = xmin
for i=2:Nxx
xx[i] = xx[i-1] + (xmax-xx[i-1])/((Nxx-i+1)^spacing)
end
f_U(c) = c.^(1-sigma)/(1-sigma)
W=Array(Float64,Nxx,1)
W[:,1] = f_U(xx)
W_temp = Array(Float64,Nalal,Naa)
interp_spline = Spline1D(xx,W[:,1])
xprime = ones(Nalal,Naa)
for i=1:NoIter
for j=1:Naa
for k=1:Nalal
W_temp[k,j] = evaluate(interp_spline, xprime[k,j])
end
end
end
toc()
elapsed time:
7.336371495 seconds
Even much slower, uhm...
Another edit: eliminating one loop actually makes it faster in this case, still not comparable to MATLAB though. The code:
function performance2()
const spacing=1.5
const Nxx = 300
const Naa = 350
const Nalal = 200
const sigma = 10
const NoIter = 500
xx=Array(Float64,Nxx)
xmin = 0.01
xmax = 400
xx[1] = xmin
for i=2:Nxx
xx[i] = xx[i-1] + (xmax-xx[i-1])/((Nxx-i+1)^spacing)
end
f_U(c) = c.^(1-sigma)/(1-sigma)
W=Array(Float64,Nxx,1)
W[:,1] = f_U(xx)
W_temp = Array(Float64,Nalal,Naa)
interp_spline = Spline1D(xx,W[:,1])
xprime = ones(Nalal,Naa)
for i=1:NoIter
for j=1:Naa
W_temp[:,j] = evaluate(interp_spline, xprime[:,j])
end
end
end
@time(performance2())
1.573347 seconds (700.04 k allocations: 620.643 MB, 1.08% gc time)
Another edit, now iterating the same number of times through the loop:
function performance3()
const spacing=1.5
const Nxx = 300
const Naa = 350
const Nalal = 200
const sigma = 10
const NoIter = 500
xx=Array(Float64,Nxx)
xmin = 0.01
xmax = 400
xx[1] = xmin
for i=2:Nxx
xx[i] = xx[i-1] + (xmax-xx[i-1])/((Nxx-i+1)^spacing)
end
f_U(c) = c.^(1-sigma)/(1-sigma)
W=Array(Float64,Nxx,1)
W[:,1] = f_U(xx)
W_temp = Array(Float64,Nalal,Naa)
W_tempr = Array(Float64, Nalal*Naa)
interp_spline = Spline1D(xx,W[:,1])
xprime = ones(Nalal,Naa)
xprimer = reshape(xprime, Nalal*Naa)
for i=1:NoIter
W_tempr = evaluate(interp_spline, xprimer)
end
W_temp = reshape(W_tempr, Nalal, Naa)
end
tic()
performance3()
toc()
elapsed time:
1.480349334 seconds
Still, not at all comparable with MATLAB. By the way, in my actual problem I am running the loop easily 50k times and I am accessing bigger xprime matrices although not sure whether that part makes a difference.
interp1
, so my best guess is that that function is highly optimised in performance. Can't say anything about Julia though, as I don't speak that. – Adriaan