Best-Practice for Holding/Looping Through a Collection of Objects with Same Abstract Parent Type (Julia)

Question

This is a beginner question, and I'm still thinking "in OOP", so I apologize if I missed the answer in the manual or if the answer is obvious.

Suppose we have an abstract type,

abstract type My_Abstract_type end

and several concrete struct types that are children of that type:

mutable struct Concrete_struct1 <: My_Abstract_type end
mutable struct Concrete_struct2 <: My_Abstract_type end
...

Suppose we have a large number of objects of the concrete types, and we need to store and loop through those objects. In Python, we could just make a list of the objects and loop through the list. Similarly, in C++, we could make an array of pointers to My_Abstract_type and loop through that, polymorphically calling everything needed.

However, I can't figure out how to do this cleanly in Julia. We can make an array my_array::Array{My_Abstract_type,1} and then loop through it:

for my_object in my_array
    do_something!(my_object)
end

but, as discussed here https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-abstract-container-1, this comes with a massive performance penalty (it's about 25x slower in my use-case).

One alternative is to do something like:

my_array1::Array{Concrete_struct1,1}
my_array2::Array{Concrete_struct2,1}
my_array3::Array{Concrete_struct3,1}
...

and then

for my_object in my_array1
    do_something!(my_object)
end
for my_object in my_array2
    do_something!(my_object)
end
for my_object in my_array3
    do_something!(my_object)
end

This gives us the performance we want, but is obviously terrible software engineering practice, particularly in cases with large numbers of concrete types. How can we store and loop over these objects in Julia cleanly and without sacrificing performance? Thank you!

Bogumił Kamiński · Accepted Answer · 2020-08-06T22:32:07

If you have no more than four concrete types, just use a Union of them as the element type of the array, as described in the Julia manual. In that case the compiler will generate efficient code.
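As a minimal sketch of the Union approach: the field `value`, the bodies of `do_something!`, the alias `SmallUnion`, and the helper `process_all!` are hypothetical and only serve the illustration; the question does not show them.

```julia
abstract type My_Abstract_type end

# Hypothetical field `value`; the structs in the question have no fields shown.
mutable struct Concrete_struct1 <: My_Abstract_type
    value::Int
end
mutable struct Concrete_struct2 <: My_Abstract_type
    value::Int
end

# Hypothetical methods standing in for the real work.
do_something!(obj::Concrete_struct1) = (obj.value += 1; nothing)
do_something!(obj::Concrete_struct2) = (obj.value += 2; nothing)

# A small Union as the element type instead of the abstract parent type.
const SmallUnion = Union{Concrete_struct1, Concrete_struct2}

my_array = SmallUnion[Concrete_struct1(0), Concrete_struct2(0), Concrete_struct1(10)]

# Keep the loop inside a function so the compiler can specialize it.
function process_all!(objects)
    for my_object in objects
        do_something!(my_object)
    end
    return objects
end

process_all!(my_array)
```

Whether this is faster than an array of My_Abstract_type in a given use case is something to confirm by benchmarking.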

If you have very many types, then you can use an array of arrays:

a = [my_array1, my_array2, my_array3]

and now do

foreach(a) do x
    for my_object in x
        do_something!(my_object)
    end
end

Now a itself will not have a concrete element type, but passing each inner array to the anonymous function acts as a function barrier, so the loop is specialized for that array's concrete type. This has some overhead (one dynamic dispatch per element of a), but as long as each array holds many more elements than there are types, it should be reasonably fast.
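The same idea can be sketched as a self-contained example with a named function in place of the anonymous one. The field `value`, the bodies of `do_something!`, and the name `process_group!` are hypothetical and chosen only for illustration.

```julia
abstract type My_Abstract_type end

# Hypothetical field `value`; the structs in the question have no fields shown.
mutable struct Concrete_struct1 <: My_Abstract_type
    value::Int
end
mutable struct Concrete_struct2 <: My_Abstract_type
    value::Int
end

# Hypothetical methods standing in for the real work.
do_something!(obj::Concrete_struct1) = (obj.value += 1; nothing)
do_something!(obj::Concrete_struct2) = (obj.value += 2; nothing)

# One concretely typed array per concrete type.
my_array1 = [Concrete_struct1(0) for _ in 1:3]
my_array2 = [Concrete_struct2(0) for _ in 1:2]

# The outer array holds the inner arrays; its element type is not concrete.
a = [my_array1, my_array2]

# Function barrier: each call is dispatched on the concrete type of `x`,
# so the loop inside is compiled for that type.
function process_group!(x)
    for my_object in x
        do_something!(my_object)
    end
    return nothing
end

foreach(process_group!, a)
```

The dynamic dispatch happens once per inner array, at the call to `process_group!`, and not once per object.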

Finally, if you want to avoid dynamic dispatch entirely and can accept a significant compilation cost, you can write something like:

processing_fun() = nothing

function processing_fun(x, xs...)
    for my_object in x
        do_something!(my_object)
    end
    processing_fun(xs...)
end

and then call:

processing_fun(a...)

(but you would have to benchmark whether this is beneficial in your specific case, since it involves recursion, which has its own cost)
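A runnable sketch of the recursive version follows. It assumes the arrays are collected in a tuple and not in an array, so that the number and types of the splatted arguments are part of the tuple's type; that choice, the field `value`, and the bodies of `do_something!` are assumptions made for illustration, not part of the answer above.

```julia
abstract type My_Abstract_type end

# Hypothetical field `value`; the structs in the question have no fields shown.
mutable struct Concrete_struct1 <: My_Abstract_type
    value::Int
end
mutable struct Concrete_struct2 <: My_Abstract_type
    value::Int
end

# Hypothetical methods standing in for the real work.
do_something!(obj::Concrete_struct1) = (obj.value += 1; nothing)
do_something!(obj::Concrete_struct2) = (obj.value += 2; nothing)

# Base case: nothing left to process.
processing_fun() = nothing

# Process the first array, then recurse on the remaining ones.
function processing_fun(x, xs...)
    for my_object in x
        do_something!(my_object)
    end
    processing_fun(xs...)
end

my_array1 = [Concrete_struct1(0) for _ in 1:3]
my_array2 = [Concrete_struct2(0) for _ in 1:2]

# Assumption: a tuple, so each array's concrete type is visible in typeof(a).
a = (my_array1, my_array2)

processing_fun(a...)
```

With many concrete types the tuple type grows accordingly, which is where the compilation cost mentioned above comes from.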
