11
votes

Standard Cell struct provides interior mutability but allows only a few mutation methods such as set(), swap() and replace(). All of these methods change the whole content of the Cell. However, sometimes more specific manipulations are needed, for example, to change only a part of data contained in the Cell.

So I tried to implement some kind of universal Cell, allowing arbitrary data manipulation. The manipulation is represented by user-defined closure that accepts a single argument - &mut reference to the interior data of the Cell, so the user itself can deside what to do with the Cell interior. The code below demonstrates the idea:

use std::cell::UnsafeCell;

struct MtCell<Data>{
    dcell: UnsafeCell<Data>,
}

impl<Data> MtCell<Data>{
    fn new(d: Data) -> MtCell<Data> {
        return MtCell{dcell: UnsafeCell::new(d)};
    }

    fn exec<F, RetType>(&self, func: F) -> RetType where
        RetType: Copy,
        F: Fn(&mut Data) -> RetType 
    {
        let p = self.dcell.get();
        let pd: &mut Data;
        unsafe{ pd = &mut *p; }
        return func(pd);
    }
}

// test:

type MyCell = MtCell<usize>;

fn main(){
    let c: MyCell = MyCell::new(5);
    println!("initial state: {}", c.exec(|pd| {return *pd;}));
    println!("state changed to {}", c.exec(|pd| {
        *pd += 10; // modify the interior "in place"
       return *pd;
    }));
}

However, I have some concerns regarding the code.

  1. Is it safe, i.e can some safe but malicious closure break Rust mutability/borrowing/lifetime rules by using this "universal" cell? I consider it safe since lifetime of the interior reference parameter prohibits its exposition beyond the closure call time. But I still have doubts (I'm new to Rust).

  2. Maybe I'm re-inventing the wheel and there exist some templates or techniques solving the problem?

Note: I posted the question here (not on code review) as it seems more related to the language rather than code itself (which represents just a concept).

[EDIT] I'd want zero cost abstraction without possibility of runtime failures, so RefCell is not perfect solution.

2
What you're attempting to implement already exists in the Rust standard library and it's called RefCell and you can get mutable refs out of it using the borrow_mut method.pretzelhammer
@pretzelhamme RefCell has runtime check (and some runtime overhead) and may cause panic, it is not exactly what I want to achieveuser396672
The runtime check is what ensures the memory safety of the abstraction.pretzelhammer
RefCell panics if you use it wrong, yes. In the cases where RefCell panics, Cell would fail to compile, and UnsafeCell would compile but cause undefined behavior. If you encounter panics with RefCell, it's because your data access patterns are unsound -- you can't fix that by using a different abstraction. (But you're right that it does have a (tiny) runtime overhead)trentcl
@trentcl I agree, runtime overhead in RefCell is very tiny (obviously, just some kind of "busy" flag), but panic seems more serious, it can even kill Arian rocket :). One can say panic is caused by a violation of RefCell contract, but compile-time safety is one of the main language features why people choose Rust. In C++, we also can avoid UB by respecting all the language contracts, although we need to read the whole C++ standard and keep the whole standard in mind forever :). Finally, I agree, RefCell contract is easier to remember, but it seems as not the finest part of the language.user396672

2 Answers

8
votes

This is a very common pitfall for Rust beginners.

  1. Is it safe, i.e can some safe but malicious closure break Rust mutability/borrowing/lifetime rules by using this "universal" cell? I consider it safe since lifetime of the interior reference parameter prohibits its exposition beyond the closure call time. But I still have doubts (I'm new to Rust).

In a word, no.

Playground

fn main() {
    let mt_cell = MtCell::new(123i8);

    mt_cell.exec(|ref1: &mut i8| {
        mt_cell.exec(|ref2: &mut i8| {
            println!("Double mutable ref!: {:?} {:?}", ref1, ref2);
        })
    })
}

You're absolutely right that the reference cannot be used outside of the closure, but inside the closure, all bets are off! In fact, pretty much any operation (read or write) on the cell within the closure is undefined behavior (UB), and may cause corruption/crashes anywhere in your program.

  1. Maybe I'm re-inventing the wheel and there exist some templates or techniques solving the problem?

Using Cell is often not the best technique, but it's impossible to know what the best solution is without knowing more about the problem.

If you insist on Cell, there are safe ways to do this. The unstable (ie. beta) Cell::update() method is literally implemented with the following code (when T: Copy):

pub fn update<F>(&self, f: F) -> T
where
    F: FnOnce(T) -> T,
{
    let old = self.get();
    let new = f(old);
    self.set(new);
    new
}

Or you could use Cell::get_mut(), but I guess that defeats the whole purpose of Cell.

However, usually the best way to change only part of a Cell is by breaking it up into separate Cells. For example, instead of Cell<(i8, i8, i8)>, use (Cell<i8>, Cell<i8>, Cell<i8>).

Still, IMO, Cell is rarely the best solution. Interior mutability is a common design in C and many other languages, but it is somewhat more rare in Rust, at least via shared references and Cell, for a number of reasons (e.g. it's not Sync, and in general people don't expect interior mutability without &mut). Ask yourself why you are using Cell and if it is really impossible to reorganize your code to use normal &mut references.

IMO the bottom line is actually about safety: if no matter what you do, the compiler complains and it seems that you need to use unsafe, then I guarantee you that 99% of the time either:

  1. There's a safe (but possibly complex/unintuitive) way to do it, or
  2. It's actually undefined behavior (like in this case).

EDIT: Frxstrem's answer also has better info about when to use Cell/RefCell.

7
votes

Your code is not safe, since you can call c.exec inside c.exec to get two mutable references to the cell contents, as demonstrated by this snippet containing only safe code:

let c: MyCell = MyCell::new(5);
c.exec(|n| {
    // need `RefCell` to access mutable reference from within `Fn` closure
    let n = RefCell::new(n);

    c.exec(|m| {
        let n = &mut *n.borrow_mut();
        
        // now `n` and `m` are mutable references to the same data, despite using
        // no unsafe code. this is BAD!
    })
})

In fact, this is exactly the reason why we have both Cell and RefCell:

  • Cell only allows you to get and set a value and does not allow you to get a mutable reference from an immutable one (thus avoiding the above issue), but it does not have any runtime cost.
  • RefCell allows you to get a mutable reference from an immutable one, but needs to perform checks at runtime to ensure that this is safe.

As far as I know, there's not really any safe way around this, so you need to make a choice in your code between no runtime cost but less flexibility, and more flexibility but with a small runtime cost.