4
votes

Consider the pattern where there are several states registered with a dispatcher and each state knows what state to transition to when it receives an appropriate event. This is a simple state transition pattern.

struct Dispatcher {
    states: HashMap<Uid, Rc<RefCell<State>>>,
}
impl Dispatcher {
    pub fn insert_state(&mut self, state_id: Uid, state: Rc<RefCell<State>>) -> Option<Rc<RefCell<State>>> {
        self.states.insert(state_id, state)
    }
    fn dispatch(&mut self, state_id: Uid, event: Event) {
        if let Some(mut state) = states.get_mut(&state_id).cloned() {
            state.handle_event(self, event);
        }
    }
}

trait State {
    fn handle_event(&mut self, &mut Dispatcher, Event);
}

struct S0 {
    state_id: Uid,
    move_only_field: Option<MOF>,
    // This is pattern that concerns me.
}
impl State for S0 {
    fn handle_event(&mut self, dispatcher: &mut Dispatcher, event: Event) {
        if event == Event::SomeEvent {
            // Do some work
            if let Some(mof) = self.mof.take() {
                let next_state = Rc::new(RefCell::new(S0 {
                    state_id: self.state_id,
                    move_only_field: mof,
                }));
                let _ = dispatcher.insert(self.state_id, next_state);
            } else {
                // log an error: BUGGY Logic somewhere
                let _ = dispatcher.remove_state(&self.state_id);
            }
        } else {
            // Do some other work, maybe transition to State S2 etc.
        }
    }
}

struct S1 {
    state_id: Uid,
    move_only_field: MOF,
}
impl State for S1 {
    fn handle_event(&mut self, dispatcher: &mut Dispatcher, event: Event) {
        // Do some work, maybe transition to State S2/S3/S4 etc.
    }
}

With reference to the inline comment above saying:

// This is pattern that concerns me.

S0::move_only_field needs to be an Option in this pattern because self is borrowed in handle_event, but I am not sure that this is best way to approach it.

Here are the ways I can think of with demerits of each one:

  1. Put it into an Option as I have done: this feels hacky and every time I need to check the invariant that the Option is always Some otherwise panic! or make it a NOP with if let Some() = and ignore the else clause, but this causes code-bloat. Doing an unwrap or bloating the code with if let Some() feels a bit off.
  2. Get it into a shared ownership Rc<RefCell<>>: Need to heap allocate all such variables or construct another struct called Inner or something that has all these non-clonable types and put that into an Rc<RefCell<>>.
  3. Pass stuff back to Dispatcher indicating it to basically remove us from the map and then move things out of us to the next State which will also be indicated via our return value: Too much coupling, breaks OOP, does not scale as Dispatcher needs to know about all the States and needs frequent updating. I don't think this is a good paradigm, but could be wrong.
  4. Implement Default for MOF above: Now we can mem::replace it with the default while moving out the old value. The burden of panicking OR returning an error OR doing a NOP is now hidden in implementation of MOF. The problem here is we don't always have the access to MOF type and for those that we do, it again takes the point of bloat from user code to the code of MOF.
  5. Let the function handle_event take self by move as fn handle_event(mut self, ...) -> Option<Self>: Now instead of Rc<RefCell<>> you will need to have Box<State> and move it out each time in the dispatcher and if the return is Some you put it back. This almost feels like a sledgehammer and makes many other idioms impossible, for instance if I wanted to share self further in some registered closure/callback I would normally put a Weak<RefCell<>> previously but now sharing self in callbacks etc is impossible.

Are there any other options? Is there any that is considered the "most idiomatic" way of doing this in Rust?

1
What does Dispatcher even do? The only way I can make sense of this is that you have multiple state machines with different Uids and the dispatcher picks one to handle each event. Why do you keep all the "used" states in the HashMap -- will a state ever be re-entered? How?trentcl
All of the pain is because the overall technique is not panic-safe. In-between the time that you've moved out the old state and moved in the new state, the program could panic. If it did, the container of the state would be in an unknown condition, which means that running the destructor could have access to memory it shouldn't; breaking a core component of Rust's safety.Shepmaster
@trentcl Dispatcher is just the normal event queue mechanism. Consider something like mio tokens. So mio will give you the readiness event (readable, writable etc) and the token for which it was invoked. You map that token (Uid here) to your state implementation. There are no "used" states in HashMap. When the state transitions it removes itself - where do you see that in the code ?ustulation
@Shepmaster I couldn't fully grasp what you meant - what dtor will have access to memory it shouldn't ? Also there is no unsafe code here so what safety has been broken ? Also how do you suggest tackling this problem ?ustulation
The destructor of S0 will call the destructor of MOF. If you've moved MOF out of S0 but not yet replaced it, that destructor would have access to undefined memory. That's why the compiler disallows it. I'm not saying you have unsafety, just explaining why that field always has to be valid otherwise there would be unsafety. There's no universal answer. See the closed RFC for lots of discussionShepmaster

1 Answers

1
votes
  1. Let the function handle_event take self by move as fn handle_event(mut self, ...) -> Option<Self>: Now instead of Rc<RefCell<>> you will need to have Box<State> and move it out each time in the dispatcher and if the return is Some you put it back.

This is what I would do. However, you don't need to switch from Rc to Box if there is only one strong reference: Rc::try_unwrap can move out of an Rc.

Here's part of how you might rewrite Dispatcher:

struct Dispatcher {
    states: HashMap<Uid, Rc<State>>,
}
impl Dispatcher {
    fn dispatch(&mut self, state_id: Uid, event: Event) {
        if let Some(state_ref) = self.states.remove(&state_id) {
            let state = state_ref.try_unwrap()
                .expect("Unique strong reference required");
            if let Some(next_state) = state.handle_event(event) {
                self.states.insert(state_id, next_state);
            }
        } else {
            // handle state_id not found
        }
    }
}

(Note: dispatch takes state_id by value. In the original version, this wasn't necessary -- it could have been changed to pass by reference. In this version, it is necessary, since state_id gets passed to HashMap::insert. It looks like Uid is Copy though, so it makes little difference.)

It's not clear whether state_id actually needs to be a member of the struct that implements State anymore, since you don't need it inside handle_event -- all the insertion and removal happens inside impl Dispatcher, which makes sense and reduces coupling between State and Dispatcher.

impl State for S0 {
    fn handle_event(self, event: Event) -> Option<Rc<State>> {
        if event == Event::SomeEvent {
            // Do some work
            let next_state = Rc::new(S0 {
                state_id: self.state_id,
                move_only_field: self.mof,
            });
            Some(next_state)
        } else {
            // Do some other work
        }
    }
}

Now you don't have to handle a weird, should-be-impossible corner case where the Option is None.

This almost feels like a sledgehammer and makes many other idioms impossible, for instance if I wanted to share self further in some registered closure/callback I would normally put a Weak<RefCell<>> previously but now sharing self in callbacks etc is impossible.

Because you can move out of an Rc if you have the only strong reference, you don't have to sacrifice this technique.

"Feels like a sledgehammer" might be subjective, but to me, what a signature like fn handle_event(mut self, ...) -> Option<Self> does is encode an invariant. With the original version, each impl State for ... had to know when to insert and remove itself from the dispatcher, and whether it did or not was uncheckable. For example, if somewhere deep in the logic you forgot to call dispatcher.insert(state_id, next_state), the state machine wouldn't transition, and might get stuck or worse. When handle_event takes self by-value, that's not possible anymore -- you have to return the next state, or the code simply won't compile.

(Aside: both the original version and mine do at least two hashtable lookups each time dispatch is called: once to get the current state, and again to insert the new state. If you wanted to get rid of the second lookup, you could combine approaches: store Option<Rc<State>> in the HashMap, and take from the Option instead of removing it from the map entirely.)