Hand-over-hand locking with Rust

Question

I'm trying to write an implementation of union-find in Rust. This is famously very simple to implement in languages like C, while still having a complex run time analysis.

I'm having trouble getting Rust's mutex semantics to allow iterative hand-over-hand locking.

Here's how I got where I am now.

First, this is a very simple implementation of part of the structure I want in C:

#include <stdlib.h>

struct node {
  struct node * parent;
};

struct node * create(struct node * parent) {
  struct node * ans = malloc(sizeof(struct node));
  ans->parent = parent;
  return ans;
}

struct node * find_root(struct node * x) {
  while (x->parent) {
    x = x->parent;
  }
  return x;
}

int main() {
  struct node * foo = create(NULL);
  struct node * bar = create(foo);
  struct node * baz = create(bar);
  baz->parent = find_root(bar);
}

Note that the structure of the pointers is that of an inverted tree; multiple pointers may point at a single location, and there are no cycles.

At this point, there is no path compression.

Here is a Rust translation. I chose to use Rust's reference-counted pointer type to support the inverted tree type I referenced above.

Note that this implementation is much more verbose, possibly due to the increased safety that Rust offers, but possibly due to my inexperience with Rust.

use std::rc::Rc;

struct Node {
    parent: Option<Rc<Node>>
}

fn create(parent: Option<Rc<Node>>) -> Node {
    Node {parent: parent.clone()}
}

fn find_root(x: Rc<Node>) -> Rc<Node> {
    let mut ans = x.clone();
    while ans.parent.is_some() {
        ans = ans.parent.clone().unwrap();
    }
    ans
}

fn main() {
    let foo = Rc::new(create(None));
    let bar = Rc::new(create(Some(foo.clone())));
    let mut prebaz = create(Some(bar.clone()));
    prebaz.parent = Some(find_root(bar.clone()));
}

Path compression re-parents each node along a path to the root every time find_root is called. To add this feature to the C code, only two new small functions are needed:

void change_root(struct node * x, struct node * root) {
  /* Stop at the root itself; re-parenting it would create a cycle. */
  while (x != root) {
    struct node * tmp = x->parent;
    x->parent = root;
    x = tmp;
  }
}

struct node * root(struct node * x) {
  struct node * ans = find_root(x);
  change_root(x, ans);
  return ans;
}

The function change_root does all the re-parenting, while the function root is just a wrapper to use the results of find_root to re-parent the nodes on the path to the root.

In order to do this in Rust, I decided I would have to use a Mutex rather than just a reference counted pointer, since the Rc interface only allows mutable access by copy-on-write when more than one pointer to the item is live. As a result, all of the code would have to change. Before even getting to the path compression part, I got hung up on find_root:

use std::sync::{Mutex,Arc};

struct Node {
    parent: Option<Arc<Mutex<Node>>>
}

fn create(parent: Option<Arc<Mutex<Node>>>) -> Node {
    Node {parent: parent.clone()}
}

fn find_root(x: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut ans = x.clone();
    let mut inner = ans.lock();
    while inner.parent.is_some() {
        ans = inner.parent.clone().unwrap();
        inner = ans.lock();
    }
    ans.clone()
}

This produces the error (with 0.12.0)

error: cannot assign to `ans` because it is borrowed
ans = inner.parent.clone().unwrap();

note: borrow of `ans` occurs here
let mut inner = ans.lock();

What I think I need here is hand-over-hand locking. For the path A -> B -> C -> ..., I need to lock A, lock B, unlock A, lock C, unlock B, ... Of course, I could keep all of the locks open: lock A, lock B, lock C, ... unlock C, unlock B, unlock A, but this seems inefficient.

However, Mutex does not offer unlock, and uses RAII instead. How can I achieve hand-over-hand locking in Rust without being able to directly call unlock?
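For reference, releasing a standard-library Mutex early amounts to dropping its guard, so drop(guard) plays the role of an explicit unlock. The following is a minimal sketch against the current standard library, where lock() returns a Result (unlike the 0.12.0 API used above); the function name unlock_by_drop is made up for illustration. It does not remove the borrow error by itself, because the guard still borrows from the Arc it was locked through.

```rust
use std::sync::Mutex;

// Hypothetical helper: reports whether the mutex was locked while the guard
// was alive, and whether it was free again after the guard was dropped.
fn unlock_by_drop() -> (bool, bool) {
    let m = Mutex::new(0);

    let guard = m.lock().unwrap();
    // try_lock does not block; it fails while `guard` is still alive.
    let locked_while_held = m.try_lock().is_err();

    // Dropping the guard is the "unlock" operation.
    drop(guard);
    let free_after_drop = m.try_lock().is_ok();

    (locked_while_held, free_after_drop)
}
```

Calling unlock_by_drop() yields (true, true).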

EDIT: As the comments noted, I could use Rc<RefCell<Node>> rather than Arc<Mutex<Node>>. Doing so leads to the same compiler error.

For clarity about what I'm trying to avoid by using hand-over-hand locking, here is a RefCell version that compiles but uses space linear in the length of the path.

fn find_root(x: Rc<RefCell<Node>>) -> Rc<RefCell<Node>> {
    let mut inner : RefMut<Node> = x.borrow_mut();
    if inner.parent.is_some() {
        find_root(inner.parent.clone().unwrap())
    } else {
        x.clone()
    }
}

No, a Mutex is almost certainly the wrong choice here. If you stay within a single task, as you probably should (by default), mutating something behind an Rc is best done with Rc<Cell<T>> (if T is Copy) or Rc<RefCell<T>> (if it's not). — user395760
I agree that a Cell or RefCell is the right avenue here. You have a method that looks immutable (find), but wants to mutate the data behind-the-scenes. That's a prime case for the Cell family. — Shepmaster
When using Rc<RefCell<Node>> in place of Arc<Mutex<Node>> (and borrow_mut in place of lock), I get exactly the same error: "error: cannot assign to ans because it is borrowed . . ." — jbapple

reem · Accepted Answer · 2014-12-18T09:03:28

We can pretty easily do full hand-over-hand locking as we traverse this list using just a bit of unsafe, which is necessary to tell the borrow checker a small bit of insight that we are aware of, but that it can't know.

But first, let's clearly formulate the problem:

We want to traverse a linked list whose nodes are stored as Arc<Mutex<Node>> to get the last node in the list.
We need to lock each node in the list as we go, so that another concurrent traversal has to follow strictly behind us and cannot interfere with our progress.

Before we get into the nitty-gritty details, let's try to write the signature for this function:

fn find_root(node: Arc<Mutex<Node>>) -> Arc<Mutex<Node>>;

Now that we know our goal, we can start to get into the implementation - here's a first attempt:

fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    // We have to separate this from incoming since the lock must
    // be borrowed from incoming, not this local node.  
    let mut node = incoming.clone();
    let mut lock = incoming.lock();

    // Could use while let but that leads to borrowing issues.
    while lock.parent.is_some() {
        node = lock.parent.as_ref().unwrap().clone(); // !! uh-oh !!
        lock = node.lock();
    }

    node
}

If we try to compile this, rustc will error on the line marked !! uh-oh !!, telling us that we can't move out of node while lock still exists, since lock is borrowing node. This is not a spurious error! The data in lock might go away as soon as node does - it's only because we know that we can keep the data lock is pointing to valid and in the same memory location even if we move node that we can fix this.

The key insight here is that the lifetime of data contained within an Arc is dynamic, and it is hard for the borrow checker to make the inferences we can about exactly how long data inside an Arc is valid.

This happens every once in a while when writing Rust: you have more knowledge about the lifetime and organization of your data than rustc does, and you want to be able to express that knowledge to the compiler, effectively saying "trust me". Enter: unsafe - our way of telling the compiler that we know more than it does, and that it should allow us to inform it of the guarantees that we know but it doesn't.

In this case, the guarantee is pretty simple - we are going to replace node while lock still exists, but we are going to ensure that the data inside lock continues to be valid even though node goes away. To express this guarantee we can use mem::transmute, a function which allows us to reinterpret a value as any other type of the same size, using it only to make the lifetime of the lock returned by node.lock() slightly longer than the borrow checker would otherwise accept.

To make sure we keep our promise, we are going to use another handoff variable to hold node while we reassign lock - even though this moves node (changing its address) and the borrow checker will be angry at us, we know it's ok since lock doesn't point at node, it points at data inside of node, whose address (in this case, since it's behind an Arc) will not change.
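The claim that moving an Arc does not move the data behind it can be checked directly. A minimal sketch, assuming the current standard library (Arc::as_ptr); the helper name pointee_address_is_stable is made up for illustration:

```rust
use std::sync::{Arc, Mutex};

// Hypothetical helper: returns true if moving the `Arc` handle around left
// the address of the `Mutex` it points to unchanged.
fn pointee_address_is_stable() -> bool {
    let node = Arc::new(Mutex::new(0u32));
    let before: *const Mutex<u32> = Arc::as_ptr(&node);

    // Move the handle twice: into a new binding, then onto the heap.
    let moved = node;
    let boxed = Box::new(moved);

    // The handle now lives somewhere else, but the pointee has not moved.
    let after: *const Mutex<u32> = Arc::as_ptr(&*boxed);
    before == after
}
```

Only the handle (a pointer plus bookkeeping) changes location; the Mutex stays in the heap allocation owned by the Arc, which is why a guard pointing at it stays valid as long as some Arc to that allocation is kept alive.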

Before we get to the solution, it's important to note that the trick we are using here is only valid because we are using an Arc. The borrow checker is warning us of a possibly serious error - if the Mutex was held inline and not in an Arc, this error would be a correct prevention of a use-after-free, where the MutexGuard held in lock would attempt to unlock a Mutex which has already been dropped, or at least moved to another memory location.

use std::mem;
use std::sync::{Arc, Mutex};

fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    let mut node = incoming.clone();
    let mut handoff_node;
    let mut lock = incoming.lock();

    // Could use while let but that leads to borrowing issues.
    while lock.parent.is_some() {
        // Keep the data in node around by holding on to this `Arc`.
        handoff_node = node;

        node = lock.parent.as_ref().unwrap().clone();

        // We are going to move out of node while this lock is still around,
        // but since we kept the data around it's ok.
        lock = unsafe { mem::transmute(node.lock()) };
    }

    node
}

And, just like that, rustc is happy, and we have hand-over-hand locking, since the last lock is released only after we have acquired the new lock!
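The code above targets the Rust of late 2014, where lock() returned the guard directly. As a rough sketch only, the same idea could be adapted to the current standard library as follows. This is an adaptation under stated assumptions, not the original code: lock() now returns a Result (unwrapped here, so a poisoned mutex panics), the guard type is spelled out as MutexGuard<'static, Node>, and the handoff variable is replaced by reassigning node only after the old guard has been released.

```rust
use std::mem;
use std::sync::{Arc, Mutex, MutexGuard};

struct Node {
    parent: Option<Arc<Mutex<Node>>>,
}

fn find_root(incoming: Arc<Mutex<Node>>) -> Arc<Mutex<Node>> {
    // Invariant: `node` owns the `Arc` whose mutex `lock` currently holds.
    let mut node = incoming;

    // SAFETY (assumed invariant, not checked by the compiler): every guard
    // is dropped before the `Arc` it was locked through, so the extended
    // lifetime never outlives the mutex it refers to.
    let mut lock: MutexGuard<'static, Node> =
        unsafe { mem::transmute(node.lock().unwrap()) };

    loop {
        let parent = match lock.parent.as_ref() {
            Some(p) => Arc::clone(p),
            None => break,
        };

        // Acquire the parent's lock while still holding the child's lock.
        let parent_lock: MutexGuard<'static, Node> =
            unsafe { mem::transmute(parent.lock().unwrap()) };

        // Release the child's lock only now, after the parent is locked.
        drop(mem::replace(&mut lock, parent_lock));

        // The old `Arc` is dropped here, after its guard is already gone.
        node = parent;
    }

    // Unlock the root before handing its `Arc` back to the caller.
    drop(lock);
    node
}
```

For a chain baz -> bar -> foo built with Arc::new(Mutex::new(Node { .. })), find_root(Arc::clone(&baz)) returns an Arc pointing at foo, and all three mutexes are unlocked again once it returns.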

There is one unanswered question in this implementation which I have not yet received an answer to, which is whether the drop of the old value and the assignment of a new value to a variable are guaranteed to be atomic - if not, there is a race condition where the old lock is released before the new lock is acquired in the assignment of lock. It's pretty trivial to work around this by just having another holdover_lock variable and moving the old lock into it before reassigning, then dropping it after reassigning lock.
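In current Rust, the right-hand side of an assignment is evaluated before the old value of the variable is dropped, so the new value exists before the old one is released. The sketch below makes that order observable. It uses a made-up Guard type that records events in a shared log instead of a real Mutex, and the names Guard, Log and handoff_order are all hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

// Hypothetical stand-in for a lock guard: logs when it is created
// ("acquire") and when it is dropped ("release").
struct Guard {
    name: &'static str,
    log: Log,
}

impl Guard {
    fn acquire(name: &'static str, log: &Log) -> Guard {
        log.borrow_mut().push(format!("acquire {}", name));
        Guard { name, log: Rc::clone(log) }
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

// Walks A -> B -> C by reassigning a single `guard` variable and returns
// the recorded order of events.
fn handoff_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));

    let mut guard = Guard::acquire("A", &log);
    for name in ["B", "C"] {
        // The new guard is created first; only then is the old one dropped.
        guard = Guard::acquire(name, &log);
    }
    drop(guard);

    let events = log.borrow().clone();
    events
}
```

handoff_order() returns "acquire A", "acquire B", "release A", "acquire C", "release B", "release C", which is the hand-over-hand order.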

Hopefully this fully addresses your question and shows how unsafe can be used to work around "deficiencies" in the borrow checker when you really do know more. I would still like to warn that the cases where you know more than the borrow checker are rare, and transmuting lifetimes is not "usual" behavior.

As you can see, using Mutex in this way is pretty complex: you have to deal with many possible sources of race conditions, and I may not even have caught all of them! Unless you really need this structure to be accessible from many threads, it would probably be best to just use Rc and, if you need it, RefCell, as this makes things much easier.
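For the single-threaded case, here is a minimal sketch of what the Rc and RefCell route could look like in current Rust, including the path compression from the question. The Link alias and the exact function shapes are assumptions made for illustration. Each borrow is a temporary that ends with its statement, so the traversal is iterative, uses constant extra space, and needs no unsafe.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical alias for a shared, mutable node handle.
type Link = Rc<RefCell<Node>>;

struct Node {
    parent: Option<Link>,
}

fn create(parent: Option<Link>) -> Link {
    Rc::new(RefCell::new(Node { parent }))
}

// Walks up to the root without path compression.
fn find_root(x: &Link) -> Link {
    let mut current = Rc::clone(x);
    loop {
        // The `borrow()` temporary is dropped at the end of this statement,
        // so `current` is free to be reassigned below.
        let parent = current.borrow().parent.clone();
        match parent {
            Some(p) => current = p,
            None => return current,
        }
    }
}

// Finds the root, then re-parents every node on the path directly to it.
fn root(x: &Link) -> Link {
    let root = find_root(x);
    let mut current = Rc::clone(x);
    while !Rc::ptr_eq(&current, &root) {
        // Swap in the root as the new parent and keep the old parent.
        let old_parent = current.borrow_mut().parent.replace(Rc::clone(&root));
        current = old_parent.expect("a non-root node always has a parent");
    }
    root
}
```

With foo = create(None), bar = create(Some(foo.clone())) and baz = create(Some(bar.clone())), calling root(&baz) returns foo and leaves baz pointing directly at foo, while foo's own parent stays None.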
