2
votes

I am fairly new to Rust; my experience is mainly in C and C++.

This code, taken from the lol_html crate's example, works:

use lol_html::{element, HtmlRewriter, Settings};

let mut output = vec![];

{
    let mut rewriter = HtmlRewriter::try_new(
        Settings {
            element_content_handlers: vec![
                // Rewrite insecure hyperlinks
                element!("a[href]", |el| {
                    let href = el
                        .get_attribute("href")
                        .unwrap()
                        .replace("http:", "https:");

                    el.set_attribute("href", &href).unwrap();

                    Ok(())
                })
            ],
            ..Settings::default()
        },
        |c: &[u8]| output.extend_from_slice(c)
    ).unwrap();

    rewriter.write(b"<div><a href=").unwrap();
    rewriter.write(b"http://example.com>").unwrap();
    rewriter.write(b"</a></div>").unwrap();
    rewriter.end().unwrap();
}

assert_eq!(
    String::from_utf8(output).unwrap(),
    r#"<div><a href="https://example.com"></a></div>"#
);

But if I move the element_content_handlers vector out into its own let binding, I get

temporary value dropped while borrowed

on the let handlers line:

use lol_html::{element, HtmlRewriter, Settings};

let mut output = vec![];

{
    let handlers = vec![
        // Rewrite insecure hyperlinks
        element!("a[href]", |el| {
            let href = el
                .get_attribute("href")
                .unwrap()
                .replace("http:", "https:");

            el.set_attribute("href", &href).unwrap();

            Ok(())
        }) // this element is deemed temporary
    ];

    let mut rewriter = HtmlRewriter::try_new(
        Settings {
            element_content_handlers: handlers,
            ..Settings::default()
        },
        |c: &[u8]| output.extend_from_slice(c)
    ).unwrap();

    rewriter.write(b"<div><a href=").unwrap();
    rewriter.write(b"http://example.com>").unwrap();
    rewriter.write(b"</a></div>").unwrap();
    rewriter.end().unwrap();
}

assert_eq!(
    String::from_utf8(output).unwrap(),
    r#"<div><a href="https://example.com"></a></div>"#
);

I think the method takes ownership of the vector, but I don't understand why it stops working when the vector is simply assigned to a variable first. I don't want to declare every element with its own let beforehand. I expect there is a simple idiom that makes the vector own all of its elements.

EDIT: The compiler suggested binding the element to a variable before that line, but what if I have a lot of elements? I would like to avoid naming, say, 50 elements. Is there a way to do this without binding each element? Also, why does the temporary's lifetime end inside the vec! invocation when it is used in a let binding, but not when I put the vec! inside a newly constructed struct passed to a method? That last question is very important to me.


3 Answers

2
votes

When I first tried to reproduce your issue, the compiler reported that try_new does not exist: it has been removed in the latest version of lol_html. After replacing it with new, the issue no longer reproduced. I was able to reproduce it with v0.2.0, though. Since the issue involves code generated by macros, I ran cargo expand (a cargo subcommand that has to be installed separately).

Here's what let handlers = ... expanded to in v0.2.0:

let handlers = <[_]>::into_vec(box [(
    &"a[href]".parse::<::lol_html::Selector>().unwrap(),
    ::lol_html::ElementContentHandlers::default().element(|el| {
        let href = el.get_attribute("href").unwrap().replace("http:", "https:");
        el.set_attribute("href", &href).unwrap();
        Ok(())
    }),
)]);

and here's what it expands to in v0.3.0:

let handlers = <[_]>::into_vec(box [(
    ::std::borrow::Cow::Owned("a[href]".parse::<::lol_html::Selector>().unwrap()),
    ::lol_html::ElementContentHandlers::default().element(|el| {
        let href = el.get_attribute("href").unwrap().replace("http:", "https:");
        el.set_attribute("href", &href).unwrap();
        Ok(())
    }),
)]);

Ignore the first line; it is just how the vec! macro expands. The second line shows the difference between what the two versions generate: v0.2.0 takes a borrow of the result of parse, while v0.3.0 wraps it in Cow::Owned. (Cow stands for clone on write, but it is more generally useful anywhere you want to accept either the borrowed or the owned version of something.)
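To make the borrowed-versus-owned distinction concrete, here is a minimal sketch using only the standard library. The Handler alias and the normalize and names functions are hypothetical stand-ins for illustration, not lol_html APIs; the u32 stands in for the handler part of the tuple.

```rust
use std::borrow::Cow;

// Hypothetical handler tuple: a selector that is either borrowed or owned,
// plus a number standing in for the actual handler.
type Handler<'a> = (Cow<'a, str>, u32);

// Hypothetical stand-in for parsing: returns a fresh owned value.
fn normalize(s: &str) -> String {
    s.trim().to_lowercase()
}

// Works the same way for borrowed and owned selectors.
fn names(handlers: &[Handler<'_>]) -> Vec<String> {
    handlers.iter().map(|(sel, _)| sel.to_string()).collect()
}

fn main() {
    // Owned: each tuple owns its selector, so the vector can live in its own
    // `let` binding without referring to any temporary.
    let owned: Vec<Handler<'static>> = vec![(Cow::Owned(normalize(" A[href] ")), 1)];

    // Borrowed: the referent has to be a named value that outlives the vector.
    let img = normalize("IMG[src]");
    let borrowed: Vec<Handler<'_>> = vec![(Cow::Borrowed(img.as_str()), 2)];

    assert_eq!(names(&owned), ["a[href]"]);
    assert_eq!(names(&borrowed), ["img[src]"]);
}
```

With the owned variant nothing in the vector points at a temporary, which is presumably why the v0.3.0 expansion no longer triggers the error.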

So the short answer is that the macro used to expand to something that borrowed a temporary, and now it expands to something that owns its value. As for why it worked without a separate assignment: Rust automatically created a temporary for you.

When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead, except if promoted to a static

https://doc.rust-lang.org/reference/expressions.html#tempora...

In the original code, Rust created several temporaries for you, all valid for roughly the same scope: the statement containing the call to try_new. When you move the vector into its own let statement, the temporary created by element! is only valid until the end of that let statement, yet handlers, which borrows it, is used afterwards.

I took a look at the git blame for the element! macro in lol_html, and they made the change because someone opened an issue with essentially your problem. So I'd say this is a bug in a leaky abstraction, not a gap in your understanding of Rust.
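The same behaviour can be sketched without lol_html. In the example below, parse and total_len are hypothetical stand-ins: parse plays the role of "a[href]".parse().unwrap(), and total_len takes the vector by value the way try_new takes Settings. The commented-out variant is the one I would expect the compiler to reject with "temporary value dropped while borrowed".

```rust
// Hypothetical stand-in for `"a[href]".parse::<Selector>().unwrap()`:
// returns a fresh owned value, which becomes a temporary when borrowed inline.
fn parse(s: &str) -> String {
    s.trim().to_string()
}

// Takes the vector by value, but the tuples inside only borrow their selectors.
fn total_len(handlers: Vec<(&String, usize)>) -> usize {
    handlers.iter().map(|(sel, weight)| sel.len() * weight).sum()
}

fn main() {
    // Compiles: the temporaries behind `&parse(..)` live until the end of this
    // whole statement, which is after `total_len` has returned.
    let inline = total_len(vec![(&parse("a[href]"), 1), (&parse("img"), 2)]);
    assert_eq!(inline, 13);

    // Expected to be rejected: the temporary is dropped at the end of the
    // `let` statement, but `handlers` is still used on the next line.
    // let handlers = vec![(&parse("a[href]"), 1)];
    // total_len(handlers);

    // Compiles: the owner now has a name and outlives the vector.
    let selector = parse("a[href]");
    let handlers = vec![(&selector, 1)];
    assert_eq!(total_len(handlers), 7);
}
```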

0
votes

The element! macro creates a temporary value, and the vector only holds a reference to it. That temporary exists only for the duration of the statement containing vec![]; at the end of the let statement it is dropped, so the reference stored in the vector no longer points to anything valid:

let handlers = vec![
 ______
|  
|    element!("a[href]", |el| {
|        let href = el.get_attribute("href").unwrap().replace("http:", "https:");
|        el.set_attribute("href", &href).unwrap();
|        Ok(())
|    }),
|______ ^ This value is temporary
]; > the element is freed here, it no longer exists!

You then try to create an HtmlRewriter from a vector that refers to a value that no longer exists!

Settings {
    element_content_handlers: handlers,
    // the element inside of `handlers` doesn't exist anymore!
    ..Settings::default()
},

Obviously, the borrow checker catches this issue, and your code doesn't compile.

The solution here is to bind that element to a variable with let:

let element = element!("a[href]", |el| {
    let href = el.get_attribute("href").unwrap().replace("http:", "https:");
    el.set_attribute("href", &href).unwrap();
    Ok(())
});

And then create the vector:

let handlers = vec![element];

Now the value is bound to a variable (element), so it lives long enough to be borrowed later in HtmlRewriter::try_new.

0
votes

When you create a value without binding it to a variable, its lifetime is tied to the innermost enclosing scope, typically the statement it appears in. A let binding in an outer scope ties the value to that scope instead, which makes it live longer. If you are creating a lot of values and then applying an operation to each of them (for example, passing them to another function), it often makes sense to build them with an iterator and collect the results into a vector. As an example,

let xs: Vec<_> = (0..10).map(|n| SomeStruct { n }).map(|s| another_function(s)).collect();

This way you don't need to bind the SomeStruct objects to anything explicitly.
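Applied to the "50 elements" concern from the question, one possible shape is to let a single named vector own every parsed value and have a second vector borrow from it. This is only a sketch: Selector, parse, parse_all and pair_with_index are hypothetical names for illustration, not lol_html types or functions.

```rust
// Hypothetical stand-in for a parsed selector; not a lol_html type.
#[derive(Debug, PartialEq)]
struct Selector(String);

fn parse(s: &str) -> Selector {
    Selector(s.trim().to_string())
}

// One vector owns every selector, so only this vector needs a name.
fn parse_all(sources: &[&str]) -> Vec<Selector> {
    sources.iter().map(|s| parse(s)).collect()
}

// A second vector borrows from the first; the index stands in for a handler.
fn pair_with_index(selectors: &[Selector]) -> Vec<(&Selector, usize)> {
    selectors.iter().enumerate().map(|(i, sel)| (sel, i)).collect()
}

fn main() {
    let selectors = parse_all(&["a[href]", "img[src]", "script"]);
    let handlers = pair_with_index(&selectors);

    assert_eq!(handlers.len(), 3);
    assert_eq!(handlers[1].0, &Selector("img[src]".to_string()));
    assert_eq!(handlers[1].1, 1);
}
```

The borrowing vector stays valid for as long as selectors is in scope, so there are two bindings in total regardless of how many elements there are.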