I'm working on a lexer in Rust.
Desired API:
enum Token<'input> {
    Name(&'input str),
}

let mut lexicon = LexiconBuilder::<Token>::new()
    .token("[a-zA-Z]+", |s| Token::Name(s))
    // among others
    .build();
let mut lexer = Lexer::new(lexicon, "input");
The idea is that the user provides a regular expression along with a closure that gets run when the regex matches the input text. However, I'm having trouble convincing the borrow checker that the slice passed into token()'s closure lives long enough. From my point of view it seems safe, since no tokens are produced until you actually provide an input string.
I've spent quite a while trying to thread the input lifetime through all of the types, but I can't ever seem to prove that the lexicon's lifetime (and therefore each rule handler's) will match or outlive the input's.
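One way around this, sketched below under assumptions (the type and field names are illustrative, not the actual library's, and the user's token type is fixed to a concrete Token<'input> rather than a generic parameter, to isolate the lifetime question): store each handler as a higher-ranked fn pointer, `for<'i> fn(&'i str) -> Token<'i>`. Then the lexicon carries no input lifetime at all; the lifetime is only chosen at the moment a handler is applied to a concrete input slice.

```rust
#[derive(Debug, PartialEq)]
enum Token<'input> {
    Name(&'input str),
}

// Higher-ranked: a handler must work for *any* input lifetime, so the
// lexicon itself is not tied to any particular input string.
// (fn pointers keep the sketch simple; Box<dyn for<'i> Fn(&'i str) -> Token<'i>>
// would additionally allow capturing closures.)
type Handler = for<'i> fn(&'i str) -> Token<'i>;

struct Lexicon {
    rules: Vec<(String, Handler)>, // (pattern, handler) pairs
}

struct LexiconBuilder {
    rules: Vec<(String, Handler)>,
}

impl LexiconBuilder {
    fn new() -> Self {
        Self { rules: Vec::new() }
    }
    fn token(mut self, pattern: &str, handler: Handler) -> Self {
        self.rules.push((pattern.to_string(), handler));
        self
    }
    fn build(self) -> Lexicon {
        Lexicon { rules: self.rules }
    }
}

fn main() {
    let lexicon = LexiconBuilder::new()
        .token("[a-zA-Z]+", |s| Token::Name(s))
        .build();
    // The input is created *after* the lexicon, yet the handler's output
    // borrows from it with no lifetime friction.
    let input = String::from("hello");
    let (_pattern, handler) = &lexicon.rules[0];
    assert_eq!(handler(&input), Token::Name("hello"));
}
```

The trade-off is that the handler's signature must hold for every possible input lifetime; generalizing this over an arbitrary user-supplied token type (as the real builder's `LexiconBuilder::<Token>` suggests) needs extra machinery, since `Token` would have to be a lifetime-parameterized type constructor rather than a plain type parameter.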
Comments:

"lexicon() in particular stands out because there's almost never any reason to have a function with an output lifetime but no input lifetimes. I imagine there is some reason why you're not just doing this, but I figured I'd just mention it anyway." – trentcl

"… next() method return the rule ID, the current slice, and the position. The library is meant to be substrate for the user's lexer, hence being parameterized by the token type." – Matt Green
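The shape described in that last comment can be sketched as follows. This is a minimal illustration, not the library's actual API: rule IDs, a `next()` that yields (rule ID, matched slice, byte position), and a returned slice that borrows from the input rather than from the lexer. To stay dependency-free, each rule here is a plain match-length function standing in for the regex engine.

```rust
// A rule reports how many bytes it matches at the start of the remaining
// input (None = no match). A stand-in for a real regex match.
type Matcher = fn(&str) -> Option<usize>;

struct Lexer<'input> {
    rules: Vec<Matcher>,
    input: &'input str,
    pos: usize, // byte offset into `input`
}

impl<'input> Lexer<'input> {
    fn new(rules: Vec<Matcher>, input: &'input str) -> Self {
        Self { rules, input, pos: 0 }
    }

    /// Returns (rule ID, matched slice, start position) for the next match
    /// and advances past it. The slice borrows from `input` (lifetime
    /// 'input), not from `&mut self`, so callers can keep tokens around
    /// across further calls.
    fn next(&mut self) -> Option<(usize, &'input str, usize)> {
        let rest = &self.input[self.pos..];
        if rest.is_empty() {
            return None;
        }
        for (id, rule) in self.rules.iter().enumerate() {
            if let Some(len) = rule(rest) {
                let start = self.pos;
                let slice = &self.input[start..start + len];
                self.pos += len;
                return Some((id, slice, start));
            }
        }
        None
    }
}

fn main() {
    // Rule 0: runs of ASCII letters; rule 1: runs of spaces.
    let word: Matcher = |s| match s.bytes().take_while(|b| b.is_ascii_alphabetic()).count() {
        0 => None,
        n => Some(n),
    };
    let space: Matcher = |s| match s.bytes().take_while(|b| *b == b' ').count() {
        0 => None,
        n => Some(n),
    };
    let mut lexer = Lexer::new(vec![word, space], "ab cd");
    assert_eq!(lexer.next(), Some((0, "ab", 0)));
    assert_eq!(lexer.next(), Some((1, " ", 2)));
    assert_eq!(lexer.next(), Some((0, "cd", 3)));
    assert_eq!(lexer.next(), None);
}
```

Because the returned slice's lifetime is tied to 'input rather than to the lexer borrow, this plays the same trick as the builder: nothing in the lexer's own types depends on when the lexicon was constructed.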