Summary of question and answers
Objects of a particular type, say
    type Foo
        a::A
        b::B
    end
can be stored in either of two ways:
- Inlined (aka by value): in this case, the statement "variable `foo::Foo` is stored at location `x`" effectively means we have a variable `foo.a::A` at location `x` and a variable `foo.b::B` at location `x + sizeof(A)` (technically the addresses could be a bit more complicated, but that's irrelevant for our purposes).
- Referenced (aka by reference): "`foo::Foo` is stored at location `x`" means the location `x` contains a pointer `fooptr::Ptr{Foo}` such that there is a variable `foo.a::A` at location `fooptr` and `foo.b::B` at location `fooptr + sizeof(A)`.
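To make the inlined layout concrete, here is a minimal sketch. It assumes a current Julia release, where `immutable` is spelled `struct`, and it uses a made-up type `InlineFoo` with two 8-byte fields so that no padding is involved. `fieldoffset` and `sizeof` are standard Base functions.

```julia
# Hypothetical immutable type; both fields are plain 8-byte values.
struct InlineFoo
    a::Int64
    b::Float64
end

# Byte offset of each field from the start of the object.
offset_a = fieldoffset(InlineFoo, 1)   # field `a` sits at the start
offset_b = fieldoffset(InlineFoo, 2)   # field `b` follows right after `a`

# Size of one inlined object in bytes.
total = sizeof(InlineFoo)

# A vector of such objects stores the elements back to back,
# so its data size is the element count times the element size.
v = [InlineFoo(1, 2.0), InlineFoo(3, 4.0)]
bytes = sizeof(v)

println((offset_a, offset_b, total, bytes))
```

Here `offset_b` equals `sizeof(Int64)`, which is the `x + sizeof(A)` from the description above. With field types of different sizes, alignment padding can shift the offsets, which is the "bit more complicated" case.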
Unlike other languages (I'm looking at you, C/C++), Julia decides by itself whether to store variables inlined or referenced, and it does so based on the properties of the type:
- mutable types -> referenced,
- immutable types -> referenced if at least one of their fields is referenced, inlined otherwise.
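The rule can be probed with `isbitstype`, which is true only for immutable types that contain no references. This is a sketch under assumptions: the three types below are invented for illustration, the syntax is that of current Julia (`mutable struct` and `struct` instead of `type` and `immutable`), and `isbitstype` is only a proxy for the compiler's actual storage decision, which newer releases may make more liberally than the rule stated here.

```julia
# Hypothetical mutable type: referenced according to the rule above.
mutable struct MutablePoint
    x::Float64
    y::Float64
end

# Hypothetical immutable type with only plain-data fields: inlined.
struct PlainPoint
    x::Float64
    y::Float64
end

# Hypothetical immutable type with a field that is itself referenced.
struct Labeled
    p::MutablePoint
    id::Int
end

plain_is_bits   = isbitstype(PlainPoint)
mutable_is_bits = isbitstype(MutablePoint)
labeled_is_bits = isbitstype(Labeled)

println((plain_is_bits, mutable_is_bits, labeled_is_bits))
```

Only `PlainPoint` qualifies as a plain-bits type; `Labeled` is immutable but does not qualify because of its `MutablePoint` field.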
There are at least two reasons for this rule:
StefanKarpinski's answer: The garbage collector needs to be able to find all pointers to heap-allocated objects on the stack. Currently, Julia ensures this by storing all such pointers on a separate "shadow stack", but if we allowed composite types containing pointers to be placed on the stack, such a neat separation would no longer be possible. Instead, the compiler would need to look for pointers among other variables, which poses technical difficulties.
yuyichao's answer: Julia requires the inline/reference decision to be made on a per-type rather than per-object basis, which means a hypothetical type `immutable A a::A end` would have to be infinitely big if we insisted on inlining it. So we would either have to forbid such recursive immutable types, or we could at most allow non-recursive immutable types to be inlined.
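The recursion argument can be illustrated with a small linked list. This is a sketch in current Julia syntax; `Node` is a made-up type, and the `Union{Node, Nothing}` field is an assumption added so that the list can actually terminate.

```julia
# Hypothetical recursive immutable type. If `next` were stored inline,
# a Node would have to contain a Node, which would contain a Node, and so on.
struct Node
    value::Int
    next::Union{Node, Nothing}   # `nothing` marks the end of the list
end

# Building a two-element list works because `next` is held by reference.
list = Node(1, Node(2, nothing))
second_value = list.next.value

# The type is not plain data, consistent with it not being inlined.
node_is_bits = isbitstype(Node)

println((second_value, node_is_bits))
```

The type has a finite size only because the `next` field holds a reference rather than a nested copy of the object.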
Original question
My understanding of memory management in Julia is:
- mutable types -> heap-allocated,
- immutable types and tuples -> stack-allocated unless one of their fields is heap-allocated (i.e. mutable).
I don't quite understand the rationale for this behaviour, however. I've read somewhere that the problem with stack-allocating immutables with pointers to mutables is that then the garbage collector might consider the mutables unreachable and destroy them prematurely. On the other hand, if we place the immutable on the heap then there will still be a pointer to the mutables, so it might seem like we avoided the problem, but actually we just shifted it to making sure that now the immutable itself will not be destroyed.
Can anyone explain this to someone who has only a very superficial knowledge of how garbage collection works?