I'm re-learning some basic ANTLR and trying to write a grammar to generate todo items from input like:
Meeting at 12pm for 20 minutes
The issue I'm having is that three lexer rules in particular are getting "mismatched" depending on the context in which they're used:
HOUR: [0-9]|'1'[0-9]|'2'[0-3];
MINUTE: [0-5][0-9];
NONZERO_NUMBER: [1-9][0-9]*;
There are some cases in which I want 12 to match the HOUR rule, and other times when I want it to match MINUTE, etc., but the parser rules don't seem to be able to influence the lexer to be context-sensitive.
For example, the string above (Read Books...) does not parse: the 12 is matched as an HOUR, but so is the 20, and the parser, which is expecting NONZERO_NUMBER at that point, fails:
line 1:20 mismatched input '20' expecting NONZERO_NUMBER
If I change the duration value so that it intentionally doesn't match the HOUR rule, it parses fine:
Meeting at 12pm for 120 minutes // Note 120 minutes doesn't match HOUR or MINUTE
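To make the behaviour I'm seeing concrete, here is a rough Python simulation of how I understand the lexer to pick tokens: at each position it takes the rule that matches the most characters, and on a tie the rule defined first wins, without consulting the parser. This is only a sketch under that assumption, not ANTLR's actual implementation; the tokenize helper and the RULES table are names I made up for illustration, and I've assumed the literals from the parser rules ('for', 'at', ':') rank ahead of the named lexer rules.

```python
import re

# Rules in the order I assume the lexer considers them: implicit literals
# from the parser rules first, then the lexer rules in grammar order.
# Regex alternatives are listed longest-first because Python's "|" takes the
# first alternative that matches, not the longest one.
RULES = [
    ("'for'", r"for"),
    ("'at'", r"at"),
    ("':'", r":"),
    ("MINUTE_STR", r"minutes?"),
    ("HOUR_STR", r"hours?"),
    ("HOUR", r"1[0-9]|2[0-3]|[0-9]"),
    ("MINUTE", r"[0-5][0-9]"),
    ("NONZERO_NUMBER", r"[1-9][0-9]*"),
    ("AMPM", r"[AaPp][Mm]"),
    ("WORD", r"[a-zA-Z]+"),
    ("WS", r"[ \n\t\r]"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def tokenize(text):
    """Return (rule, text) pairs using longest match, earliest rule on ties."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in COMPILED:
            match = pattern.match(text, pos)
            # Strict ">" means a later rule only wins if it matches MORE text.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at column {pos}")
        if best_name != "WS":  # WS is skipped, as in the grammar
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    for line in ("Meeting at 12pm for 20 minutes",
                 "Meeting at 12pm for 120 minutes"):
        print(line)
        for token in tokenize(line):
            print("   ", token)
```

In this simulation the 20 comes out as HOUR (all three numeric rules match two characters, and HOUR is listed first), while 120 comes out as NONZERO_NUMBER because that is the only rule that can consume all three digits.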
Is there any way to "convince" the lexer to try to match the expected token (as defined for the parser) before trying other/earlier rules?
Sidenote: I realize there are other oddities, like an event name being limited to a single word, but I'm tackling one problem at a time.
Here's my full grammar for clarity:
grammar Sprint;
event: eventName timePhrase? durationPhrase?;
durationPhrase: 'for' duration;
timePhrase: 'at' time;
duration: (NONZERO_NUMBER MINUTE_STR) | (NONZERO_NUMBER HOUR_STR);
time: ((HOUR ':' MINUTE) | (HOUR)) AMPM?;
eventName: WORD;
MINUTE_STR: 'minute'('s')?;
HOUR_STR: 'hour'('s')?;
HOUR: [0-9]|'1'[0-9]|'2'[0-3];
MINUTE: [0-5][0-9];
NONZERO_NUMBER: [1-9][0-9]*;
AMPM: ('A'|'a'|'P'|'p')('M'|'m');
WORD: ('a'..'z' | 'A'..'Z')+;
WS: (' '|[\n\t\r]) -> skip;