Safe caching in Rust without refcounting

Question

I'm trying to write a function that returns some sort of list. Since this function is pure, and I'm likely to call it multiple times, I want to cache the results. To avoid allocations, I want to return references that all point to the same memory location, so that I only need to perform an allocation once per unique call of the function.

In principle, I could store each result as a vector in a cache, and return references to the result vector's slice. This looks unsafe, but (I think) isn't. If you never delete or modify elements of the cache, then your slice references are safe. The vectors might be moved when you add to the cache, but their slices shouldn't be.

Obviously the borrow checker won't accept this solution. As far as it's concerned, the cache is borrowed when your function returns, and you can't call it again because that requires a mutable borrow.

The best solution I have so far is on the playground here, and included below:

use std::collections::HashMap;
use std::rc::Rc;

fn n_trues(n: usize, cache: &mut HashMap<usize, Rc<[bool]>>) -> Rc<[bool]> {
    cache
        .entry(n)
        .or_insert_with(|| vec![true; n].into())
        .clone()
}

fn main() {
    let mut cache = HashMap::new();
    let zero = n_trues(0, &mut cache);
    let one = n_trues(1, &mut cache);
    let other_zero = n_trues(0, &mut cache);
    assert!(Rc::ptr_eq(&zero, &other_zero));
    for x in [zero, one, other_zero].iter() {
        println!("{:?}", x);
    }
}

The Rc satisfies the borrow checker: deleting from the cache won't invalidate the other references, and you can't mutate the contents of an Rc. However, the actual refcounting is superfluous: we ensure that the cache goes out of scope only at the end of the program, so the cached values are not freed until then. This is the sort of thing the borrow checker can check: it can ensure that cache outlives all the results.

Is there any way to eliminate the refcount without resorting to unsafe code?

'If you never delete or modify elements of the cache, then your slice references are safe'. If there are new elements added to cache, then slice elements are not safe. Hence either returning a slice or adding an element to cache should be mutually exclusive or the borrow checker will complain. Is that right ? — Malice

Shepmaster Shepmaster · Accepted Answer · 2017-09-18T20:18:44

If you never delete or modify elements of the cache, then your slice references are safe. The vectors might be moved when you add to the cache, but their slices shouldn't be.

Your analysis appears correct to me.

To be clear, this means that nothing can ever be removed from the HashMap and nothing can ever be added or removed from the Vec inside the HashMap. Doing any of that could cause memory to be reallocated and thus invalidate the references. It also only works because the Vec introduces a level of indirection to the heap which can then be stable.

This looks unsafe, but (I think) isn't.

Safety has very particular meaning in Rust. Specifically, there are types of behavior that are never allowed in any Rust code. Even when using an unsafe block, you are not allowed to invoke that behavior.

Safe Rust is a subset of all Rust that the compiler can guarantee never performs any of the aforementioned issues. Notably, that means that there are certain types of code that do not generate undefined behavior, but that the compiler cannot guarantee. That's where unsafe code comes into play. The programmer, not the compiler, is required to verify the guarantees are never broken, just like they would in C or C++.

Unsafe Rust code isn't bad, it just requires an order of magnitude more programmer scrutiny than safe Rust code. This is still a lot better than applying that level of scrutiny to all of your code (again, as you would in C or C++).

Is there any way to eliminate the refcount without resorting to unsafe code?

There's nothing I'm aware of. What you can do is look for a crate that already does it for you.

One I like to use is typed-arena, which allows allocating many things that all live the same length of time:

extern crate typed_arena;

use typed_arena::Arena;

use std::collections::HashMap;

fn n_trues<'a>(n: usize, slab: &'a Arena<Vec<bool>>, cache: &mut HashMap<usize, &'a [bool]>) -> &'a [bool] {
    cache
        .entry(n)
        .or_insert_with(|| slab.alloc(vec![true; n]))
        .clone()
}

fn main() {
    let slab = Arena::new();
    let mut cache = HashMap::new();
    let zero = n_trues(0, &slab, &mut cache);
    let one = n_trues(1, &slab, &mut cache);
    let other_zero = n_trues(0, &slab, &mut cache);
    assert_eq!(zero.as_ptr(), other_zero.as_ptr());
    for x in [zero, one, other_zero].iter() {
        println!("{:?}", x);
    }
}

Safe caching in Rust without refcounting

1 Answers