4
votes

Consider the following example from the Book:

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {}", result);
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

It is said that (emphasis mine)

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a. In practice, it means that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the references passed in. These constraints are what we want Rust to enforce.

Shouldn't the emphasized sentence (the second one in the quote) instead read "The function signature also tells Rust that the string slice returned from the function will live at most as long as lifetime 'a"? That way, we are assured that as long as both x and y are alive, the return value is also valid, because the latter references the former.

To paraphrase, if x, y and the return value all live at least as long as lifetime 'a, then the compiler can simply let 'a be an empty scope (which any item can outlive) to satisfy the restriction, rendering the annotation useless. This doesn't make sense, right?

2
No, it should live at least as long, because if it lives longer it is still a valid lifetime. - Netwave
@Netwave but it won't be valid if it lives too long, e.g. outliving both x and y. - nalzok
Aaah, OK. I misunderstood. In that case both x and y are bound by the same lifetime, so the returned reference could live at most that long, as you say. Makes sense, yes. - Netwave
As a consequence, the caller is guaranteed that the returned value can safely be used so long as both x and y are still valid. - Jmb
Expressed in formal language, the annotation translates to: for all 'a, 'a≤'x and 'a≤'y implies 'a≤'r (with 'x, 'y and 'r the lifetimes of x, y, and the return value respectively). For that relation to hold for all 'a, you must necessarily have 'x≤'r or 'y≤'r. - Jmb

2 Answers

2
votes

Expressed in formal language, the annotation translates to:

for all 'a, 'a≤'x and 'a≤'y implies 'a≤'r

where 'x, 'y and 'r are the lifetimes of x, y, and the return value respectively.

This links the lifetime of the return value to the lifetimes of the parameters: for that relation to hold for all 'a, you must necessarily have 'x≤'r or 'y≤'r.
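The step from the "for all 'a" statement to the final disjunction can be spelled out as follows. This is only a sketch: it assumes lifetimes can be compared like nested scopes (a total order), and it writes x, y, r, a for 'x, 'y, 'r, 'a.

```latex
\begin{align*}
&\forall a:\ (a \le x \,\land\, a \le y) \implies a \le r \\
&\text{take } a = \min(x, y):\quad \min(x, y) \le x,\ \ \min(x, y) \le y
  \ \Longrightarrow\ \min(x, y) \le r \\
&\min(x, y) \in \{x, y\} \ \Longrightarrow\ x \le r \ \lor\ y \le r
\end{align*}
```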

The compiler uses that annotation in two places:

  1. When compiling the annotated function, the compiler knows neither the actual lifetimes of x and y nor 'a (since 'a will be chosen at the call site, like all generic parameters). It does know that when the function gets called, the caller will use some lifetime 'a that matches the input constraints 'a≤'x and 'a≤'y, and it checks that the code of the function respects the output constraint 'a≤'r.

  2. When calling the annotated function, the compiler adds to its constraint solver an unknown scope 'a in which the return value can be accessed, along with the constraints 'a≤'x and 'a≤'y, plus whatever extra constraints the surrounding code requires, in particular where x and y come from and how the return value is used. If the compiler is able to find some scope 'a that matches all the constraints, then the code compiles using that scope. Otherwise compilation fails with a "does not live long enough" error.
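The two checks above can be sketched with the function from the question. The helper `demo` and the variable names `outer`, `inner` and `len` are hypothetical, introduced only for illustration, and the comment about the rejected variant describes the error one would expect rather than output from an actual run.

```rust
// Check 1 (function body): returning `x` or `y` is accepted because each
// parameter is declared valid for at least 'a, which is what the return
// type promises.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Check 2 (call site): the compiler has to find one scope 'a that covers
// every use of `result` while staying inside both borrows.
// `demo` is a hypothetical helper, not part of the original example.
fn demo() -> usize {
    let outer = String::from("abcd");
    let len;
    {
        let inner = String::from("xyz");
        let result = longest(outer.as_str(), inner.as_str());
        // `result` is only used while both `outer` and `inner` are alive,
        // so a suitable 'a exists.
        len = result.len();
    }
    // Using `result` itself after the inner block would force 'a to extend
    // past the end of `inner`, and the call would be rejected because
    // `inner` does not live long enough.
    len
}

fn main() {
    println!("length of the longest string: {}", demo());
}
```

Note that 'a cannot simply collapse to an empty scope here: every use of `result` adds a constraint that 'a must cover that use.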

1
votes

Consider your example code with a slight modification to the scopes:

fn main() {
    let string1 = String::from("abcd");

    {
        let string2 = "xyz";
        let result = longest(string1.as_str(), string2);
        println!("The longest string is {}", result);
    }
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Here, for the call to longest above, the lifetime 'a ends up being the lifetime of string2. Both parameters x and y must live at least as long as 'a, so if 'a were the lifetime of string1, the second argument, string2, would not live as long as 'a, and the statement "both parameters must live at least as long as 'a" would be false. (Strictly speaking, the literal "xyz" is a &'static str, so its data lives for the whole program; the reasoning applies exactly as written when string2 is an owned String passed as string2.as_str().)

Take 'a to be the lifetime of string2. We know that the string slice returned by longest could be either string1 or string2. Since the declaration requires that the return value also lives at least as long as 'a, we are really saying that the return value lives at least as long as string2, the string with the shorter of the two lifetimes.

If longest returned string2, then the returned string slice would live exactly as long as 'a. If longest returned string1, however, the returned string slice would live as long as string1, which is longer than 'a (the lifetime of string2). That is why we say that the string slice returned from the function will live at least as long as 'a.

An important thing to note here is that we don't know which slice longest is going to return, so the returned reference may only be used for the shorter of the two lifetimes, since during that shorter lifetime both strings are certainly still alive.
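As a rough illustration of that last point, the sketch below contrasts longest with a hypothetical function `first` (not from the Book or the question) whose return value is tied to its first parameter only. It assumes both strings are owned Strings, so that the inner one really is dropped at the end of its block.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Hypothetical contrast: the signature promises that the result borrows
// from `x` only, so it may outlive whatever `_y` points to.
fn first<'a, 'b>(x: &'a str, _y: &'b str) -> &'a str {
    x
}

// `demo` is a hypothetical helper used only for this illustration.
fn demo() -> String {
    let string1 = String::from("abcd");
    let kept;
    {
        let string2 = String::from("xyz");

        // At run time this returns a slice of string1, but the signature
        // only guarantees validity for the shorter of the two borrows.
        let result = longest(string1.as_str(), string2.as_str());
        assert_eq!(result, "abcd");
        // Writing `kept = result;` here would be rejected, because `kept`
        // is used after string2 is dropped.

        // With `first`, the result depends on string1 alone, so it can
        // leave the inner scope.
        kept = first(string1.as_str(), string2.as_str());
    }
    kept.to_string()
}

fn main() {
    println!("kept: {}", demo());
}
```

The compiler never looks at which branch longest takes at run time; it reasons from the signature alone, which is why the shorter lifetime is the only safe choice.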