9
votes

I've been trying to get my head around the Rust borrowing and ownership model.

Suppose we have the following code:

fn main() {
    let a = String::from("short");
    {
        let b = String::from("a long long long string");
        println!("{}", min(&a, &b));
    }
}

fn min<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() < b.len() {
        return a;
    } else {
        return b;
    }
}

min() just returns a reference to the shorter of the two referenced strings. main() passes in two string references whose referents are defined in different scopes. I've used String::from() so that the references don't have a static lifetime. The program correctly prints short. Here is the example in the Rust Playground.

If we refer to the Rustonomicon (which I appreciate is a work in progress doc), we are told that the meaning of a function signature like:

fn as_str<'a>(data: &'a u32) -> &'a str

means the function:

takes a reference to a u32 with some lifetime, and promises that it can produce a reference to a str that can live just as long.

Now let's turn to the signature of min() from my example:

fn min<'a>(a: &'a str, b: &'a str) -> &'a str

This is more involved, since:

  • We have two input references.
  • Their referents are defined in different scopes, meaning that they are valid for different lifetimes (a is valid longer).

Using similar wording to the quoted statement above, what does the function signature of min() mean?

  1. The function accepts two references and promises to produce a reference to a str that can live as long as the referents of a and b? That feels wrong somehow: if we return the reference to b from min(), then clearly that reference is not valid for the lifetime of a in main().

  2. The function accepts two references and promises to produce a reference to a str that can live as long as the shorter of the two referents of a and b? That could work, since both referents of a and b remain valid in the inner scope of main().

  3. Something else entirely?

To summarise, I don't understand what it means to bind the lifetimes of the two input references of min() to the same lifetime when their referents are defined in different scopes in the caller.


3 Answers

4
votes

It's (2): the returned reference lives as long as the shorter input lifetime.

However, from the perspective of the function, both input lifetimes are in fact the same (both being 'a). So given that the variable a from main() clearly lives longer than b, how does this work?

The trick is that the caller shortens the lifetime of one of the two references to match min()'s function signature. If you have a reference &'x T, you can convert it to &'y T iff 'x outlives 'y (also written: 'x: 'y). This makes intuitive sense: we can shorten the lifetime of a reference without bad consequences. The compiler performs this conversion automatically. So imagine that the compiler turns your main() into:

let a = String::from("short");
{
    let b = String::from("a long long long string");

    // NOTE: this syntax is not valid Rust! 
    let a_ref: &'a_in_main str = &a;
    let b_ref: &'b_in_main str = &b;
    println!("{}", min(a_ref as &'b_in_main str, b_ref));
    //                       ^^^^^^^^^^^^^^^^^^
}

This has to do with something called subtyping and you can read more about this in this excellent answer.

To summarize: the caller shortens one lifetime to match the function signature such that the function can just assume both references have the same lifetime.
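
The shortening described above can also be written in valid Rust. The following is only a sketch: shorten() is a hypothetical helper (it is not in the question's code) that makes explicit the conversion the compiler normally performs implicitly.

```rust
// Hypothetical helper: spells out the coercion from a longer lifetime to a
// shorter one. `'long: 'short` reads as "'long outlives 'short".
fn shorten<'long: 'short, 'short>(r: &'long str) -> &'short str {
    r
}

fn min<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() < b.len() { a } else { b }
}

fn main() {
    let a = String::from("short");
    {
        let b = String::from("a long long long string");
        let b_ref: &str = &b;
        // Explicitly shorten the borrow of `a`; the compiler infers 'short
        // so that it matches the borrow of `b`.
        let a_ref: &str = shorten(&a);
        println!("{}", min(a_ref, b_ref)); // prints "short"
    }
}
```

Removing the call to shorten() and passing &a directly compiles just the same, which is the point: the conversion is always available whenever the original lifetime outlives the target one.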

2
votes

I'm going to go for (3) something else!

With your function signature:

fn min<'a>(a: &'a str, b: &'a str) -> &'a str { ...}

// ...
min(&a, &b)

the 'a is not the lifetime of the objects being borrowed. It is a new lifetime generated by the compiler just for this call. a and b will be borrowed (or possibly reborrowed) for as long as needed for the call, extended by the scope of the return value (since it references the same 'a).

Some examples:

let a = String::from("short");  // a is never mutated, so it needs no mut
{
    let mut b = String::from("a long long long string");
    // a and b borrowed for the duration of the println!()
    println!("{}", min(&a, &b));
    // a and b borrowed for the duration of the expression, but not
    // later (since l is not a reference)
    let l = min(&a, &b).len();

    {
        // borrowed for s's scope
        let s = min(&a, &b);
        // Invalid: b is borrowed until s goes out of scope
        // b += "...";
    }
    b += "...";  // Ok: b is no longer borrowed.
    // Borrow a and b again to print:
    println!("{}", min(&a, &b));
}

As you can see, the 'a for any individual call is distinct from the lifetime of the actual a and b which are borrowed, though of course both must outlive the generated lifetime of each call.

(Playground)
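
To illustrate that 'a is chosen afresh for every call, here is a small sketch in which the same min() is used once with 'a inferred as 'static and once with a much shorter lifetime. The function shortest_static() is a hypothetical name introduced only for this example.

```rust
fn min<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() < b.len() { a } else { b }
}

// Hypothetical helper: both arguments are string literals (&'static str),
// so for this call the compiler can pick 'a = 'static.
fn shortest_static() -> &'static str {
    min("short", "a long long long string")
}

fn main() {
    let s = shortest_static();
    let local = String::from("tiny");
    // For this call 'a is limited by the borrow of `local`; the 'static
    // reference `s` is simply shortened to fit.
    let t = min(s, &local);
    println!("{} {}", s, t); // prints "short tiny"
}
```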

1
votes

Apart from what @Lukas has mentioned in his answer, you can also read the signature of the function as: the returned reference is valid only as long as both passed references are valid, i.e. it's a conjunction (aka AND) of the parameters' lifetimes.

There is something more to it. Below are two code examples:

let a = String::from("short");
{
    let c: &str;
    let b = String::from("a long long long string");
    c = min(&a, &b);
}

and

let a = String::from("short");
{
    let b = String::from("a long long long string");
    let c: &str;
    c = min(&a, &b);
}

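With the lexical borrow checker, the first one doesn't work (the second one does). It may seem that b and c have the same lifetime because they are in the same scope, but the order of declaration within the scope also matters: locals are dropped in reverse order of declaration, so in the first case b's lifetime ends before c's. Note that current compilers, which use non-lexical lifetimes, accept both snippets, because c is never used after b is dropped.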
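
The "AND" reading can be sketched as follows: the reference returned by min() may only be used while both inputs are alive. If the value is needed for longer, one option (an assumption of this sketch, not something from the question) is to copy it into an owned String before the shorter-lived input goes away.

```rust
fn min<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() < b.len() { a } else { b }
}

fn main() {
    let a = String::from("short");
    let kept: String;
    {
        let b = String::from("a long long long string");
        // `r` is valid only while both `a` and `b` are valid.
        let r = min(&a, &b);
        // Copy the text out so it no longer depends on either borrow.
        kept = r.to_owned();
    }
    // `b` is gone here, but the owned copy is still usable.
    println!("{}", kept); // prints "short"
}
```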