How efficient are push!() and append!() methods in Julia?

Question

On this page it says that methods push!() and append!() are very efficient.

My question is how exactly efficient they are?

Namely,

If one knows the size of the final array, is it still faster to preallocate the array or growing it incrementally using append!() / push!() would be just as efficient?

Now consider the case when one does not know the size of the final array. For example, merging multiple arrays into 1 big array (call it A).

Two ways to achieve this:

append!()-ing each array to A, whose size has not been preallocated.
First sum dimensions of each array to find the final size of the merged array A. Then preallocate A and copy over contents of each array.

Which one would be more efficient in this case?

No idea what julia-lang is, but you just can't do better than preallocating and copying. Because the memory should be allocated no matter what, and preallocating it is the best that you can do. Of course this applies in the context of a contigious array... — Selçuk Cihan
Would definitely recommend you to check Julia out, especially if you like Python. — aberdysh
check the performance tips: docs.julialang.org/en/release-0.4/manual/performance-tips/… — amrods
They don't say anything about append!() / push!()'s benchmark — aberdysh

Colin T Bowers Colin T Bowers · Accepted Answer · 2016-01-12T22:57:31

The answer to a question like this is usually: "it depends". For example, what size array are you trying to make? What is the element-type of the array?

But if you're just after a heuristic, why not run a simple speed test? For example, the following snippet:

function f1(N::Int)
    x = Array(Int, N)
    for n = 1:N
        x[n] = n
    end
    return(x)
end

function f2(N::Int)
    x = Array(Int, 0)
    for n = 1:N
        push!(x, n)
    end
    return(x)
end

f1(2)
f2(2)

N = 5000000000
@time f1(N)
@time f2(N)

suggests that using push! is about 6 times slower than pre-allocating. If you were using append! to add larger blocks with less steps, the multiplier would almost certainly be less.

When interpreting these numbers, resist the knee-jerk reaction of "What!? 6-times slower!?". This number needs to be placed in the context of how important array building is to your entire program/function/subroutine. For example, if array building comprises only 1% of the run-time of your routine (for most typical routines, array building would comprise much less than 1%), then if your routine runs for 100 seconds, 1 second is spent building arrays. Multiply that by 6 to get 6 seconds. 99 seconds + 6 seconds = 105 seconds. So, using push! instead of pre-allocating increases the runtime of your whole program by 5%. Unless you work in high-frequency trading, you're probably not going to care about that.

For myself, my usual rule is this: if I can pre-allocate easily, then I pre-allocate. But if push! makes the routine much easier to code, with lower possibility of introducing bugs, and less messing around trying to pre-determine the appropriate array size, then I use push! without a second thought.

Final note: if you want to actually look at the specifics of how push! works, you'll need to delve into the C routines, since the julia source just wraps a ccall.

UPDATE: OP questioned in the comments the difference between push! and an operation like array(end+1) = n in MATLAB. I haven't coded in MATLAB recently, but I do keep a copy on my machine since the code for all my older papers is in MATLAB. My current version is R2014a. My understanding is that in this version of MATLAB, adding to the end of the array will re-allocate the entire array. In contrast, push! in Julia works, to the best of my knowledge, much like lists in .NET. The memory allocated to the vector is dynamically added in blocks as the size of the vector grows. This massively reduces the amount of re-allocation that needs to be performed, although my understanding is that some re-allocation is still necessary (I'm happy to be corrected on this point). So push! should work much faster than adding to an array in Matlab. So we can run the following MATLAB code:

N = 10000000;
tic
x = ones(N, 1);
for n = 1:N
    x(n) = n;
end
toc


N = 10000000;
tic
x = [];
for n = 1:N
    x(end+1) = n;
end
toc

I get:

Elapsed time is 0.407288 seconds.
Elapsed time is 1.802845 seconds.

So, about a 5-times slowdown. Given the extreme non-rigour applied in the timing methodology, one might be tempted to say this is equivalent to the Julia case. But wait, if we re-run the exercise in Julia with N = 10000000, the timings are 0.01 and 0.07 seconds. The sheer difference in the magnitude of these numbers to the MATLAB numbers makes me very nervous about making claims about what is actually happening under the hood, and whether it is legitimate to compare the 5-times slowdown in MATLAB to the 6-times slowdown in Julia. Basically, I'm now out of my depth. Maybe someone who knows more about what MATLAB actually does under the hood can offer more insight. Regarding Julia, I'm not much of a C-coder, so I doubt I'll get much insight from looking through the source (which is publicly available, unlike MATLAB).

How efficient are push!() and append!() methods in Julia?

2 Answers