Running the entire thing faster is what any sane person cares about.
As far as I understand, Julia is optimized for running code many times within a session; the very first run is slower because Julia compiles code just before executing it. Python (CPython), by contrast, does no JIT compilation at all, so it runs at the same (interpreted) speed from the very first run.
Julia 1.6
So, I pasted your Julia code into code.jl
and ran it multiple times within the same session:
# New Julia session!
julia> @time include("code.jl")
[long array...]
24.660636 seconds (42.99 M allocations: 2.607 GiB, 3.82% gc time, 0.02% compilation time)
julia> @time include("code.jl")
[long array...]
2.761062 seconds (5.61 M allocations: 240.159 MiB, 10.39% gc time, 57.06% compilation time)
julia> @time include("code.jl")
[long array...]
2.608917 seconds (5.61 M allocations: 240.164 MiB, 4.47% gc time, 61.75% compilation time)
# Restarted Julia
julia> @time include("code.jl")
25.538249 seconds (42.99 M allocations: 2.607 GiB, 3.76% gc time, 0.02% compilation time)
julia> @time include("code.jl")
2.740550 seconds (5.61 M allocations: 240.159 MiB, 9.94% gc time, 56.72% compilation time)
So, it takes about 25 seconds to run the code the first time and around 3 seconds (!) on subsequent runs, even though more than 50% of those 3 seconds is still spent compiling. Oddly, only 0.02% of the initial 25 seconds is reported as compilation time, so by that metric the first-run slowdown isn't compilation? Also note the number of memory allocations: 43 million on the first run vs. about 5.6 million (roughly 7 times fewer) on later runs. Either way, the first run is really slow, while subsequent runs are fast.
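The first-call effect is easy to reproduce in base Julia with a toy function (`poly` below is a made-up example, not from the original code): the first call to a method for a given argument type includes compilation, later calls with the same type reuse the compiled code, and a new argument type triggers compilation again.

```julia
# Minimal base-Julia demo of JIT behavior. `poly` is a hypothetical
# function used only for this demonstration.
poly(x) = 10x^3 + sin(x) - 8x^2

t_first  = @elapsed poly(1.0)   # compiles poly(::Float64) — slow
t_second = @elapsed poly(1.0)   # already compiled — fast
t_int    = @elapsed poly(1)     # new method instance poly(::Int) — slow again
println((t_first, t_second, t_int))
```

The same pattern explains the numbers above: the 25-second run pays the compile cost for everything, and the 3-second runs mostly reuse it.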
Loading packages the first time is slow too:
julia> @time using Symbolics
3.503349 seconds (6.42 M allocations: 460.519 MiB, 3.53% gc time, 0.13% compilation time)
julia> @time using Symbolics
0.000241 seconds (136 allocations: 9.641 KiB)
0.000280 seconds (136 allocations: 9.641 KiB)
0.000249 seconds (136 allocations: 9.641 KiB)
0.000251 seconds (136 allocations: 9.641 KiB)
0.000252 seconds (136 allocations: 9.641 KiB)
0.000246 seconds (136 allocations: 9.641 KiB)
# I didn't import it before,
# but apparently `Symbolics` did
julia> @time using QuadGK
0.000276 seconds (137 allocations: 9.688 KiB)
0.000276 seconds (136 allocations: 9.641 KiB)
0.000240 seconds (136 allocations: 9.641 KiB)
0.000251 seconds (136 allocations: 9.641 KiB)
That is, 3.5 seconds are spent just executing the first line of the code, the imports. Subsequent imports are nearly instantaneous, presumably because the loaded modules are cached in the session.
The first run of the list comprehension is slow as well:
julia> @time m = [i * 10*x^3 + 1/i * sin(x) + 5*i*x^3 * cos(x) - 8i*x^2 + 2/sin(i*3.0)*x + exp(1/(x+10)) for i in 1:500];
2.590259 seconds (4.69 M allocations: 284.672 MiB, 10.86% gc time, 98.69% compilation time)
julia> @time m = [i * 10*x^3 + 1/i * sin(x) + 5*i*x^3 * cos(x) - 8i*x^2 + 2/sin(i*3.0)*x + exp(1/(x+10)) for i in 1:500];
0.102573 seconds (231.21 k allocations: 12.507 MiB, 72.61% compilation time)
0.098871 seconds (231.21 k allocations: 12.508 MiB, 72.39% compilation time)
0.108458 seconds (231.21 k allocations: 12.512 MiB, 7.93% gc time, 67.73% compilation time)
0.099787 seconds (231.22 k allocations: 12.508 MiB, 72.99% compilation time)
0.098378 seconds (231.21 k allocations: 12.507 MiB, 73.80% compilation time)
Again, slow startup (98.69% compilation time), but the next runs are way faster.
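Part of why even the reruns above still report ~70% compilation time is that each top-level evaluation of a comprehension creates a fresh anonymous function that has to be compiled again. Wrapping the work in a named function lets it compile once per session; a base-Julia sketch (the expression here is a numeric stand-in, not the symbolic one from the original code):

```julia
# Wrapping the comprehension in a named function means it is compiled
# once per session; later calls reuse the compiled method.
# (Numeric stand-in for the symbolic expression in the original code.)
build_m(x) = [i * 10x^3 + 1/i * sin(x) + 5i * x^3 * cos(x) -
              8i * x^2 + 2 / sin(i * 3.0) * x + exp(1 / (x + 10))
              for i in 1:500]

t1 = @elapsed build_m(1.0)   # first call: includes compilation
t2 = @elapsed build_m(1.0)   # compiled already: only execution time
println("first: $t1 s, next: $t2 s")
```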
Python 3.9.2
~/t/SO_q $ time python3 thecode.py
________________________________________________________
Executed in 5,88 secs
~/t/SO_q $ time python3 thecode.py
________________________________________________________
Executed in 5,90 secs
Executed in 5,36 secs
Executed in 5,39 secs
Executed in 5,35 secs
Executed in 5,36 secs
Executed in 5,77 secs
Executed in 6,10 secs
Executed in 5,38 secs
Thus, the Python code consistently runs in about 5.5–6 seconds, which is roughly 2 times slower than the subsequent runs of the Julia code! However, Python delivers that speed as soon as the interpreter starts, while Julia first spends time compiling code and doing... whatever else requires 43 million memory allocations. What Julia gives you in exchange for the painful startup is the performance of compiled code: in this example, roughly twice as fast as Python.
How to make Julia faster
- Build a custom sysimage. This looks like overkill to me, unless you really need to restart Julia every time to run your code.
- Simply run your code from the same REPL. The simplest variant is to call `include("your_code.jl")` again after modifying the file. This may lead to weird errors, because the environment keeps data from previous runs.
- Run the code in Pluto, a notebook that also keeps a live Julia session but is smart about managing the environment.
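For the sysimage route, a minimal sketch using PackageCompiler.jl (the package names match the imports used above; the output path is illustrative, and PackageCompiler.jl must be installed first):

```julia
# Sketch: bake Symbolics and QuadGK into a custom sysimage so their
# load and compile cost is paid once, at image-build time.
using PackageCompiler

create_sysimage(
    ["Symbolics", "QuadGK"];          # packages to precompile into the image
    sysimage_path = "sys_custom.so",  # output shared library (illustrative name)
)
```

You would then start Julia with `julia --sysimage sys_custom.so`, and `using Symbolics` becomes essentially free. Note that building the image itself takes several minutes, which is why this only pays off if you restart Julia often.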
(From a comment by user2317421: `quadgk(m_d_f, 0, 1)` takes 264 ms on my computer, measured with `@benchmark`, vs. 1.36 s ± 15.6 ms for `integrate_matrix(m, x, 0, 1)`, measured with `%timeit` in IPython. It seems that the time you're observing includes compilation time.)