2
votes

Say I have defined my own object pool struct. Internally it keeps a Vec of all the objects and some data structure that lets it know which items in the vector are currently handed out and which are free. It has an allocate method that returns an index of an unused item in the vector, and a free method to tell the pool at an index in the vector is available to be used again.

Is it possible for me to define the API of my object pool in such a way that the type system and borrow checker will guarantee that I free an object back to the correct pool? This is assuming a situation where I might have multiple instances of the pool that are all the same type. It seems to me that for the regular global allocator rust doesn't have to worry about this problem because there is only one global allocator.

usage example:

fn foo() {
    let new_obj1 = global_pool1.allocate();
    let new_obj2 = global_pool2.allocate();

    // do stuff with objects

    global_pool1.free(new_obj2); // oops!, giving back to the wrong pool
    global_pool2.free(new_obj1); // oops!, giving back to the wrong pool
}
3
Can you give an example of how you would use the API for an object pool, and allocate/free objects from it?Coder-256
So long as you don't explicitly use unsafe code, the compiler will guarantee that everything is dropped once at the end of its respective lifetime without any exceptions. I imagine anything else would likely be considered undefined behavior. If it can't make that guarantee, then it will refuse to compile your code. This is one of the big selling points of rust and is so essential to the rust language that you can always assume it to be true in future updates.Locke
@Coder-256 addedJoseph Garvin
@Locke in this instance everything gets dropped, the problem is invariants in each pool will get messed up, they will mark vector indexes as free that are notJoseph Garvin
@JosephGarvin in that case, you may find it interesting to look at crates.io/crates/generational-arena. It isn't a perfect match for what you are asking, but it deals with a similar problem.Locke

3 Answers

2
votes

First, something to consider is that inserting an item into a Vec can sometimes cause it to reallocate and change addresses, which means that all existing references to items in the Vec become invalid. I imagine you had intended that users could keep references to items in the Vec and simultaneously insert new items, but sadly this is not possible.

One way to solve this is the approach used by generational_arena. Inserting an object returns an index. You can call arena.remove(index) to free the object, and arena.get[_mut](index) to get a reference to the object, borrowing the entire arena.

However let's assume for the sake of argument that you had a way to keep references to the arena while inserting new items and doing whatever other operations you may need. Considering that a reference is essentially a pointer, the answer is no, there is no way to automatically remember where it came from. However, you can create a "smart pointer" similar to Box, Rc, etc. that keeps a reference to the arena in order to free the object when it is dropped.

For example (very rough pseudocode):

struct Arena<T>(Vec<UnsafeCell<T>>);

struct ArenaMutPointer<'a, T> {
    arena: &'a Arena,
    index: usize,
}

impl<T> DerefMut for ArenaPointer<'_, T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe { self.arena[self.index].get() as &mut T }
    }
}

impl<T> Drop for ArenaPointer<'_, T> {
    fn drop(&mut self) {
        self.arena.free(self.index);
    }
}
2
votes

Branding

There have been multiple forays in the idea of using lifetimes as brands, to tie a particular variable to one other variable, and no other.

It has notably been explored in order to obtain indices that are guaranteed to be within bounds: checked once at creation, and always usable afterwards.

Unfortunately, this requires creating invariant lifetimes to prevent the compiler for "merging" multiple lifetimes together, and while possible I have not yet seen any compelling API.

Affine, not Linear.

It is also important to note that Rust does not have a Linear type system, but an Affine one.

A Linear type system is a system where each value is used exactly once, whereas an Affine type system is a system where each value is used at most once.

The consequence, here, is that it is very easy to accidentally forget to return an object to the pool. While it is always safe in Rust to leak objects -- and mem::forget is an easy way to do so -- those cases generally stand out like sore thumbs so are relatively easily audited. On the other hand, merely forgetting to return a value to the pool would result in an accidental leak which may cost quite some time.

Just Drop It!

The solution, therefore, is to just let the value return itself to the pool it came from in its Drop implementation:

  • It makes it impossible to forget to return accidentally.
  • It requires the object to hold a reference to the pool it came from, making it impossible to accidentally return it to the wrong pool.

It does come at a cost, of course, namely an extra 8 bytes stored with the object.

There are two potential solutions, here:

  • Thin pointers: struct Thin<'a, T>(&'a Pooled<'a, T>); where struct Pooled<'a, T>(&'a Pool<T>, T);.
  • Fat pointers: struct Fat<'a, T>(&'a Pool<T>, &'a T);.

For simplicity, I'd advise starting with the Fat alternative: it's simpler.

Then the Drop implementation of Thin or Fat just return the pointer to the pool.

2
votes

You can use a Zero Sized Type (ZST, for short) to get the API you want, without the overhead of another pointer.

Here is an implementation for 2 pools, which can be extended to support any number of pools using a macro to generate the "marker" struct (P1, P2, etc). A major downside is that forgetting to free using the pool will "leak" the memory.

This Ferrous Systems blog post has a number of possible improvements that might interest you, especially if you statically allocate the pools, and they also have a number of tricks for playing with the visibility of P1 so that users wont be able to misuse the API.


use std::marker::PhantomData;
use std::{cell::RefCell, mem::size_of};

struct Index<D>(usize, PhantomData<D>);
struct Pool<D> {
    data: Vec<[u8; 4]>,
    free_list: RefCell<Vec<bool>>,
    marker: PhantomData<D>,
}

impl<D> Pool<D> {
    fn new() -> Pool<D> {
        Pool {
            data: vec![[0,0,0,0]],
            free_list: vec![true].into(),
            marker: PhantomData::default(),
        }
    }
    
    fn allocate(&self) -> Index<D> {
        self.free_list.borrow_mut()[0] = false;
        
        Index(0, self.marker)
    }
    
    fn free<'a>(&self, item: Index<D>) {
        self.free_list.borrow_mut()[item.0] = true;
    }
}

struct P1;
fn create_pool1() -> Pool<P1> {
    assert_eq!(size_of::<Index<P1>>(), size_of::<usize>());
    Pool::new()
}

struct P2;
fn create_pool2() -> Pool<P2> {
    Pool::new()
}


fn main() {
    
    let global_pool1 = create_pool1();
    let global_pool2 = create_pool2();
    
    let new_obj1 = global_pool1.allocate();
    let new_obj2 = global_pool2.allocate();

    // do stuff with objects

    global_pool1.free(new_obj1);
    global_pool2.free(new_obj2);

    global_pool1.free(new_obj2); // oops!, giving back to the wrong pool
    global_pool2.free(new_obj1); // oops!, giving back to the wrong pool
}

Trying to free using the wrong pool results in:

error[E0308]: mismatched types
  --> zpool\src\main.rs:57:23
   |
57 |     global_pool1.free(new_obj2); // oops!, giving back to the wrong pool
   |                       ^^^^^^^^ expected struct `P1`, found struct `P2`
   |
   = note: expected struct `Index<P1>`
              found struct `Index<P2>`

Link to playground

This can be improved a little so that the borrow checker will enforce that Index will not out-live the Pool, using:

fn allocate(&self) -> Index<&'_ D> {
    self.free_list.borrow_mut()[0] = false;
        
    Index(0, Default::default())
}

So you get this error if the pool is dropped while an Index is alive:

error[E0505]: cannot move out of `global_pool1` because it is borrowed
  --> zpool\src\main.rs:54:10
   |
49 |     let new_obj1 = global_pool1.allocate();
   |                    ------------ borrow of `global_pool1` occurs here
...
54 |     drop(global_pool1);
   |          ^^^^^^^^^^^^ move out of `global_pool1` occurs here
...
58 |     println!("{}", new_obj1.0);
   |                    ---------- borrow later used here

Link to playground

Also, a link to playground with Item API (returning an Item, vs only and Index)