Should lexer rules be unambiguous in Antlr4?

Suppose I would like to parse dates and have defined the following rules:

hour: DIGIT09 | (DIGIT1 DIGIT09) | (DIGIT2 DIGIT04);

month: DIGIT19 | (DIGIT1 DIGIT02);

DIGIT12: '1'..'2';

DIGIT1: '1';

DIGIT2: '2';

DIGIT19: '1'..'9';

DIGIT09: '0'..'9';

DIGIT04: '0'..'4';

DIGIT02: '0'..'2';

Here I defined the digit ranges in the lexer, but it looks like this doesn't work because the ranges overlap and the lexer rules are therefore ambiguous.

Can I define the ranges in the parser instead of the lexer?

1 Answer

This type of validation is best performed in a listener or visitor which executes after a parse tree is created. Start with just a number:

NUMBER : [0-9]+;

Then define hour and month based on this:

hour : NUMBER;
month : NUMBER;

After you have a parse tree, implement enterHour and enterMonth in a listener to check that the NUMBER contained in each rule falls within the allowed range.
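A minimal sketch of such a listener, written for the Python target, might look like the following. The class name RangeValidator and the FakeToken/FakeContext stand-ins are hypothetical and exist only so the snippet runs without generated code; in a real project the class would subclass the listener that ANTLR generates for your grammar and be driven by a ParseTreeWalker. The bounds (0 to 24 for hour, 1 to 12 for month) are taken from the ranges in the question.

```python
class RangeValidator:
    """Listener-shaped sketch: records range errors instead of raising."""

    def __init__(self):
        self.errors = []

    def _check(self, ctx, name, low, high):
        # ctx.NUMBER().getText() mirrors how a generated context exposes
        # the text of its NUMBER token.
        value = int(ctx.NUMBER().getText())
        if not low <= value <= high:
            self.errors.append(f"{name} out of range: {value}")

    def enterHour(self, ctx):
        # Bounds assumed from the question's hour rule (0..24).
        self._check(ctx, "hour", 0, 24)

    def enterMonth(self, ctx):
        # Bounds assumed from the question's month rule (1..12).
        self._check(ctx, "month", 1, 12)


# Hypothetical stand-ins for the generated context classes, used only
# so that this sketch can be run on its own.
class FakeToken:
    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


class FakeContext:
    def __init__(self, text):
        self._token = FakeToken(text)

    def NUMBER(self):
        return self._token


validator = RangeValidator()
validator.enterHour(FakeContext("25"))
validator.enterMonth(FakeContext("7"))
print(validator.errors)
```

Collecting the messages in a list rather than raising on the first problem lets you report every out-of-range value in one pass.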

This approach yields the best combination of error recovery and error reporting in the event the user enters incorrect input.
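
As for why the original lexer rules do not work: the ANTLR4 lexer prefers the rule that matches the longest input and, when several rules match the same length, the rule that appears first in the grammar. Every rule in the question matches exactly one character, so the first matching rule always wins. The sketch below only imitates that tie-break in plain Python (it is not ANTLR itself), with the rules listed in the order they appear in the question.

```python
# Lexer rules in grammar order, each reduced to the characters it accepts.
RULES = [
    ("DIGIT12", "12"),
    ("DIGIT1", "1"),
    ("DIGIT2", "2"),
    ("DIGIT19", "123456789"),
    ("DIGIT09", "0123456789"),
    ("DIGIT04", "01234"),
    ("DIGIT02", "012"),  # the '0'..'2' rule
]


def token_type(ch):
    """Return the first rule whose character set contains ch.

    All rules match a single character, so every candidate has the same
    length and the earliest rule wins the tie.
    """
    for name, chars in RULES:
        if ch in chars:
            return name
    return None


def tokenize(text):
    return [token_type(ch) for ch in text]


print(tokenize("12"))
print(sorted({token_type(ch) for ch in "0123456789"}))
```

Under this model only DIGIT12, DIGIT19 and DIGIT09 are ever produced, so parser alternatives such as DIGIT1 DIGIT02 can never match. Overlapping parser alternatives are resolved by the parser's prediction instead, which is why a single NUMBER token with the range logic moved out of the lexer behaves predictably.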