
Context

I have a DataStore<Key,Value> trait that abstracts out data storage. (For example, I can create a simple implementation of this trait for data stores that wrap Vecs and HashMaps.) I would like this abstraction because some use cases/targets require small but computationally inefficient stores and others allow for larger stores. (Edit: added references to self in trait definition, below.)

// Stores data of type V indexed by K
trait DataStore<K, V> {
    fn new() -> Self;
    fn get(&self, k: K) -> Option<&V>;
    fn insert(&mut self, v: V) -> Option<K>;
}

I now want to define a struct Thing that contains two data stores: a store for one type, Apple<T>, and a store for another type, Banana<T>. Here is my first attempt:

// some objects I'd like to keep in DataStores
struct Apple<T> { shine: T }
struct Banana<T> { spottedness: T }

// Attempt #1: cumbersome, have to always specify generic constraints when 
//             using Thing elsewhere
pub struct Thing<K, T, AppleStore, BananaStore>
    where AppleStore: DataStore<K, Apple<T>>,
          BananaStore: DataStore<K, Banana<T>>
{
    apple_store: AppleStore,
    banana_store: BananaStore,
}

This approach is cumbersome to work with since I have to always type out <K, T, AppleStore, BananaStore> where ... whenever I want to pass Thing to a function or implement a trait for Thing even if said function or trait doesn't care about either of the two stores. For example, if I want to implement a trait for Thing that does some unrelated operations to other attributes with type T I still have to tell it about K, AppleStore, and BananaStore.

I learned about type aliases and tried the following:


// Attempt #2: looks easier to use. only two generics on Thing: the type of
//             the indexes and the type of the internal parameters. not sure
//             about the role of dyn, though, since this should be checkable
//             at compile time
type AppleStore<K, T> = dyn DataStore<K, Apple<T>>;
type BananaStore<K, T> = dyn DataStore<K, Banana<T>>;


pub struct Thing<K, T> {
    apple_store: AppleStore<K, T>,
    banana_store: BananaStore<K, T>,
}

A new problem appears when I try to create a new BananaStore in Thing's constructor. This is allowed in Attempt #1 since traits are allowed to implement functions that (1) do not take &self as argument and (2) return type Self. But this is not allowed in Attempt #2 because dynamic traits need things to be Sized and that's not allowed with Self returns. (Or something?)


impl<K, T> Thing<K, T> {
    pub fn new(apple_store: AppleStore<K, T>) -> Self {
        Thing {
            apple_store: apple_store,
            banana_store: BananaStore::new(), // not allowed to do this with
                                              // dynamic type aliases?
        }
    }
}

Question

Do I need to create a BananaStore outside of Thing and pass it in as a parameter or is there a way to hide the construction of BananaStore from the outside? I suppose something like a ThingBuilder may be a valid approach if one of my goals is to hide unnecessary (optional) object creation. But I also don't want to provide a default implementer of BananaStore: the user should explicitly declare what kind of DataStore is used for BananaStore.

I formulate the problem in this way because eventually I want Thing's AppleStore to actually be shared among multiple Thing instances; that is, multiple Things can reference the same Apple<T> in a store. But each Thing will have its own BananaStore. I know this will require using Rc or Arc or something like that on AppleStore, but I will cross that bridge when I get to it.

Comment:

Have you considered just not putting bounds on Thing? Putting bounds on the struct is often a bad idea, especially if you have a lot of code that does not depend on those bounds; it just constrains code that has no need for the constraints. (trentcl)

1 Answer


DataStore

Your DataStore has a few issues that are causing problems for you. The get and insert functions need a reference to self to work. Without a reference to self they would have to produce references from nothing, which will cause lifetime issues going forward. The get function should also accept the key by reference, to match how a map such as HashMap works. Taking the key by reference means callers do not need K to implement Copy (or to clone the key) for repeated lookups, and it prevents future issues with lifetimes.

trait DataStore<K, V> {
    // Returning Self here makes it impossible to use this trait as a
    // trait object (dyn DataStore) like in attempt 2
    fn new() -> Self;

    // These need a reference to self to store and retrieve data
    fn get(&self, k: &K) -> Option<&V>;
    fn insert(&mut self, v: V) -> Option<K>;
}

These small changes should help make debugging a bit easier from now on.
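
To make the revised trait concrete, here is a minimal sketch of one possible implementation. VecStore is a hypothetical name that does not appear in the question; it simply wraps a Vec and uses the element's index as the key, which is an assumption about how keys might be handed out.

```rust
trait DataStore<K, V> {
    fn new() -> Self;
    fn get(&self, k: &K) -> Option<&V>;
    fn insert(&mut self, v: V) -> Option<K>;
}

// Hypothetical Vec-backed store (not from the original post).
// The key is the position of the value inside the Vec.
struct VecStore<V> {
    items: Vec<V>,
}

impl<V> DataStore<usize, V> for VecStore<V> {
    fn new() -> Self {
        VecStore { items: Vec::new() }
    }

    // The key is borrowed, so the caller keeps ownership of it
    fn get(&self, k: &usize) -> Option<&V> {
        self.items.get(*k)
    }

    // Appends the value and returns the index it was stored at
    fn insert(&mut self, v: V) -> Option<usize> {
        self.items.push(v);
        Some(self.items.len() - 1)
    }
}

fn main() {
    let mut store = <VecStore<&str> as DataStore<usize, &str>>::new();
    let key = store.insert("apple").expect("insert returns a key");
    assert_eq!(store.get(&key), Some(&"apple"));
    assert_eq!(store.get(&(key + 1)), None);
}
```

A HashMap-backed store could follow the same shape, with its own rule for generating keys.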

Attempt 1

With that tweak to DataStore, we have already resolved the issue with lifetimes. All you need to do is add in a PhantomData to compensate for the extra type parameters.

I'm also going to remove the where clause in my example, since I have had issues with it in the past when adding it to structs.

use std::marker::PhantomData;

// AppleStore and BananaStore converted to single letters to make it a bit more concise
struct Thing<K, T, A: DataStore<K, Apple<T>>, B: DataStore<K, Banana<T>>> {
    apple_store: A,
    banana_store: B,
    // K and T appear only in the bounds, so PhantomData marks them as used
    _phantom: PhantomData<(K, T)>,
}

Other than that, this should probably be your go-to choice. Since it doesn't involve trait objects like attempt 2 does, there shouldn't be any issues with lifetimes going forward, and it is much easier to work with.
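
As a rough sketch of how this generic version could answer the original question, the constructor below takes the apple store from the caller and builds the banana store itself through B::new(). The caller still names the banana store type explicitly but never constructs it. VecStore and the accessor methods apple, add_banana and banana are hypothetical helpers added for illustration only.

```rust
use std::marker::PhantomData;

trait DataStore<K, V> {
    fn new() -> Self;
    fn get(&self, k: &K) -> Option<&V>;
    fn insert(&mut self, v: V) -> Option<K>;
}

struct Apple<T> { shine: T }
struct Banana<T> { spottedness: T }

struct Thing<K, T, A: DataStore<K, Apple<T>>, B: DataStore<K, Banana<T>>> {
    apple_store: A,
    banana_store: B,
    _phantom: PhantomData<(K, T)>,
}

impl<K, T, A: DataStore<K, Apple<T>>, B: DataStore<K, Banana<T>>> Thing<K, T, A, B> {
    // The apple store is passed in; the banana store is created internally
    fn new(apple_store: A) -> Self {
        Thing {
            apple_store,
            banana_store: B::new(),
            _phantom: PhantomData,
        }
    }

    // Hypothetical accessors, only here to exercise both stores
    fn apple(&self, k: &K) -> Option<&Apple<T>> {
        self.apple_store.get(k)
    }

    fn add_banana(&mut self, b: Banana<T>) -> Option<K> {
        self.banana_store.insert(b)
    }

    fn banana(&self, k: &K) -> Option<&Banana<T>> {
        self.banana_store.get(k)
    }
}

// Hypothetical Vec-backed store used as the concrete DataStore
struct VecStore<V> { items: Vec<V> }

impl<V> DataStore<usize, V> for VecStore<V> {
    fn new() -> Self {
        VecStore { items: Vec::new() }
    }
    fn get(&self, k: &usize) -> Option<&V> {
        self.items.get(*k)
    }
    fn insert(&mut self, v: V) -> Option<usize> {
        self.items.push(v);
        Some(self.items.len() - 1)
    }
}

fn main() {
    let mut apples = <VecStore<Apple<u8>> as DataStore<usize, Apple<u8>>>::new();
    let apple_key = apples.insert(Apple { shine: 9 }).expect("key");

    // The banana store type is spelled out here, but built inside Thing::new
    let mut thing: Thing<usize, u8, VecStore<Apple<u8>>, VecStore<Banana<u8>>> =
        Thing::new(apples);
    let banana_key = thing.add_banana(Banana { spottedness: 3 }).expect("key");

    assert_eq!(thing.apple(&apple_key).map(|a| a.shine), Some(9));
    assert_eq!(thing.banana(&banana_key).map(|b| b.spottedness), Some(3));
}
```

The long type annotation on thing is the cost the question describes; a type alias for a commonly used combination of stores can shorten it.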

Attempt 2

This attempt fails because trait objects (dyn DataStore<...>) are dynamically sized, so the compiler cannot tell where one field ends in memory and the next begins. We can fix this by boxing the trait objects first. You can think of this as storing pointers instead of the actual data within the struct. (The trait also has to be usable as a trait object, which new() -> Self prevents unless that function is bounded by where Self: Sized.)

pub struct Thing<K, T> {
    apple_store: Box<dyn DataStore<K, Apple<T>> + 'static>,
    banana_store: Box<dyn DataStore<K, Banana<T>> + 'static>,
}

The 'static essentially just means that the concrete DataStore behind the box won't contain any borrowed references that could constrain its lifetime. You could also leave the lifetimes as parameters, but that would almost definitely cause you more problems in the future.

pub struct Thing<'a, 'b, K, T> {
    apple_store: Box<dyn DataStore<K, Apple<T>> + 'a>,
    banana_store: Box<dyn DataStore<K, Banana<T>> + 'b>,
}
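
For the boxed version to compile, the trait itself has to be usable as a trait object. The sketch below assumes you are free to add where Self: Sized to new, which excludes it from the trait object while keeping it available on concrete types. The generic parameter B on Thing::new, the accessor methods and VecStore are hypothetical names introduced for illustration.

```rust
trait DataStore<K, V> {
    // `where Self: Sized` excludes `new` from the trait object,
    // so `dyn DataStore<K, V>` becomes legal
    fn new() -> Self
    where
        Self: Sized;
    fn get(&self, k: &K) -> Option<&V>;
    fn insert(&mut self, v: V) -> Option<K>;
}

struct Apple<T> { shine: T }
struct Banana<T> { spottedness: T }

struct Thing<K, T> {
    apple_store: Box<dyn DataStore<K, Apple<T>>>,
    banana_store: Box<dyn DataStore<K, Banana<T>>>,
}

impl<K, T> Thing<K, T> {
    // The caller names the banana store type `B`; construction stays in here
    fn new<B>(apple_store: Box<dyn DataStore<K, Apple<T>>>) -> Self
    where
        B: DataStore<K, Banana<T>> + 'static,
    {
        Thing {
            apple_store,
            banana_store: Box::new(B::new()),
        }
    }

    // Hypothetical accessors, only here to exercise both stores
    fn apple(&self, k: &K) -> Option<&Apple<T>> {
        self.apple_store.get(k)
    }

    fn add_banana(&mut self, b: Banana<T>) -> Option<K> {
        self.banana_store.insert(b)
    }

    fn banana(&self, k: &K) -> Option<&Banana<T>> {
        self.banana_store.get(k)
    }
}

// Hypothetical Vec-backed store used as the concrete DataStore
struct VecStore<V> { items: Vec<V> }

impl<V> DataStore<usize, V> for VecStore<V> {
    fn new() -> Self {
        VecStore { items: Vec::new() }
    }
    fn get(&self, k: &usize) -> Option<&V> {
        self.items.get(*k)
    }
    fn insert(&mut self, v: V) -> Option<usize> {
        self.items.push(v);
        Some(self.items.len() - 1)
    }
}

fn main() {
    let mut apples: VecStore<Apple<u8>> = VecStore { items: Vec::new() };
    let apple_key = apples.insert(Apple { shine: 9 }).expect("key");

    // Only K and T appear in Thing's type; the store types are erased
    let mut thing = Thing::<usize, u8>::new::<VecStore<Banana<u8>>>(Box::new(apples));
    let banana_key = thing.add_banana(Banana { spottedness: 3 }).expect("key");

    assert_eq!(thing.apple(&apple_key).map(|a| a.shine), Some(9));
    assert_eq!(thing.banana(&banana_key).map(|b| b.spottedness), Some(3));
}
```

The trade-off compared with attempt 1 is a heap allocation and dynamic dispatch per store, in exchange for a Thing<K, T> type that no longer mentions the store types.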