63
votes

Rust has the Any trait, but it also has a "do not pay for what you do not use" policy. How does Rust implement reflection?

My guess is that Rust uses lazy tagging. Every type is initially unassigned, but later if an instance of the type is passed to a function expecting an Any trait, the type is assigned a TypeId.

Or maybe Rust puts a TypeId on every type that its instance is possibly passed to that function? I guess the former would be expensive.

1

1 Answers

75
votes

First of all, Rust doesn't have reflection; reflection implies you can get details about a type at runtime, like the fields, methods, interfaces it implements, etc. You can not do this with Rust. The closest you can get is explicitly implementing (or deriving) a trait that provides this information.

Each type gets a TypeId assigned to it at compile time. Because having globally ordered IDs is hard, the ID is an integer derived from a combination of the type's definition, and assorted metadata about the crate in which it's contained. To put it another way: they're not assigned in any sort of order, they're just hashes of the various bits of information that go into defining the type. [1]

If you look at the source for the Any trait, you'll see the single implementation for Any:

impl<T: 'static + ?Sized > Any for T {
    fn get_type_id(&self) -> TypeId { TypeId::of::<T>() }
}

(The bounds can be informally reduced to "all types that aren't borrowed from something else".)

You can also find the definition of TypeId:

pub struct TypeId {
    t: u64,
}

impl TypeId {
    pub const fn of<T: ?Sized + 'static>() -> TypeId {
        TypeId {
            t: unsafe { intrinsics::type_id::<T>() },
        }
    }
}

intrinsics::type_id is an internal function recognised by the compiler that, given a type, returns its internal type ID. This call just gets replaced at compile time with the literal integer type ID; there's no actual call here. [2] That's how TypeId knows what a type's ID is. TypeId, then, is just a wrapper around this u64 to hide the implementation details from users. If you find it conceptually simpler, you can just think of a type's TypeId as being a constant 64-bit integer that the compiler just knows at compile time.

Any forwards to this from get_type_id, meaning that get_type_id is really just binding the trait method to the appropriate TypeId::of method. It's just there to ensure that if you have an Any, you can find out the original type's TypeId.

Now, Any is implemented for most types, but this doesn't mean that all those types actually have an Any implementation floating around in memory. What actually happens is that the compiler only generates the actual code for a type's Any implementation if someone writes code that requires it. [3] In other words, if you never use the Any implementation for a given type, the compiler will never generate it.

This is how Rust fulfills "do not pay for what do you not use": if you never pass a given type as &Any or Box<Any>, then the associated code is never generated and never takes up any space in your compiled binary.


[1]: Frustratingly, this means that a type's TypeId can change value depending on precisely how the library gets compiled, to the point that compiling it as a dependency (as opposed to as a standalone build) causes TypeIds to change.

[2]: Insofar as I am aware. I could be wrong about this, but I'd be really surprised if that's the case.

[3]: This is generally true of generics in Rust.