This is a difficult question to answer precisely while covering all corner cases.
Here is my take on a simple (hopefully not overly simplified) explanation:
- Julia code is compiled to native assembly instructions, which are then executed.
- If the Julia compiler is able to prove that two implementations are equivalent, you can expect the same native assembly instructions to be generated for both of them (this is not 100% true, but in my experience it is a good approximation).
- This means that at the native assembly level it does not matter which type you used (built-in or your own), as long as the operation you want to perform and the type information available to the compiler are the same.
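You can check this claim yourself by dumping the native code of two implementations and comparing the output. A minimal sketch using the standard-library `InteractiveUtils` module (the function names `double1`/`double2` are just illustrative):

```julia
using InteractiveUtils  # provides code_native / @code_native

double1(x::Int) = 2x
double2(x::Int) = x + x

# Capture the generated assembly of each implementation as a string.
buf = IOBuffer()
code_native(buf, double1, (Int,))
asm1 = String(take!(buf))
code_native(buf, double2, (Int,))
asm2 = String(take!(buf))
```

For trivially equivalent definitions like these, the instruction sequences typically match up to labels and metadata.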
Here is a short example (I am using a struct, but the situation is the same with your own primitive types):
```julia
struct A
    a::Int
end

function f(x::Int, n)
    s = Int[]
    for i in 1:n
        push!(s, x)
    end
    s
end

function f(x::A, n)
    s = Int[]
    for i in 1:n
        push!(s, x.a)
    end
    s
end

function f2(x::A, n)
    s = A[]
    for i in 1:n
        push!(s, x)
    end
    s
end
```
Now if you run `@code_native f(1, 10^6)`, `@code_native f(A(1), 10^6)`, and `@code_native f2(A(1), 10^6)`, you will see that the generated code is (almost) identical.
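As mentioned above, the same applies to user-defined primitive types. A minimal sketch (the name `MyInt` is hypothetical, chosen just for illustration):

```julia
# A hypothetical custom 64-bit primitive type.
primitive type MyInt 64 end

# reinterpret converts between MyInt and Int at zero runtime cost:
# both share the same 64-bit representation.
MyInt(x::Int) = reinterpret(MyInt, x)
Base.Int(x::MyInt) = reinterpret(Int, x)
```

Operations defined on such a type in terms of the underlying `Int` compile to the same native code as the plain-`Int` versions.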
You can see the effect of this in benchmarks:
```julia
julia> using BenchmarkTools

julia> @btime f(1, 10^6);
  8.567 ms (20 allocations: 9.00 MiB)

julia> @btime f(A(1), 10^6);
  8.528 ms (20 allocations: 9.00 MiB)

julia> @btime f2(A(1), 10^6);
  8.446 ms (20 allocations: 9.00 MiB)
```
You have the same timings and the same number of allocations.
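One way to see why the allocations are identical is to check the memory layout of the wrapper type directly with the standard `isbitstype` and `sizeof` functions:

```julia
struct A
    a::Int
end

# A is a plain bits type with the same size as Int, so wrapping an Int
# in A costs nothing: a Vector{A} stores the payloads inline,
# exactly like a Vector{Int}.
println(isbitstype(A))             # true
println(sizeof(A) == sizeof(Int))  # true
```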
But now consider the following definition:
```julia
struct C
    a
end

function f(x::C, n)
    s = Int[]
    for i in 1:n
        push!(s, x.a)
    end
    s
end
```
Now benchmarking this function gives:
```julia
julia> @btime f(C(1), 10^6);
  19.855 ms (21 allocations: 9.00 MiB)
```
The reason is that in type `C` the field `a` can hold a value of any type, so the compiler cannot prove that `x.a` is an `Int` and has to do some additional work because of this. You can verify that this is the case by comparing the output of `@code_warntype f(C(1), 10^6)` against `@code_warntype f(A(1), 10^6)`.
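If you do need a field that can hold values of different types, the standard fix (not shown in the benchmarks above, but common Julia practice) is a parametric struct; a sketch with a hypothetical type `D`:

```julia
# For each concrete instantiation, e.g. D{Int}, the field type is known
# exactly, so the compiler can prove x.a is an Int and keep the fast path.
struct D{T}
    a::T
end

function f(x::D, n)
    s = Int[]
    for i in 1:n
        push!(s, x.a)
    end
    s
end
```

With this definition, `@btime f(D(1), 10^6)` should be back in line with the `A` version, and `@code_warntype f(D(1), 10^6)` should show no type instability.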