You should include an explicit EOF
at the end of your entry rule any time you are trying to parse an entire input file. If you do not include the EOF
, it means you are not trying to parse the entire input, and it's acceptable to parse only a portion of the input if it means avoiding a syntax error.
For example, consider the following rule:
file : item*;
This rule means "Parse as many item
elements as possible, and then stop." In other words, this rule will never attempt to recover from a syntax error because it will always assume that the syntax error is part of some syntactic construct that's beyond the scope of the file
rule. Syntax errors will not even be reported, because the parser will simply stop.
If instead I had the following rule:
file : item* EOF;
In means "A file consists exactly of a sequence of zero-or-more item
elements." If a syntax error is reached while parsing an item
element, this rule will attempt to recover from (and report) the syntax error and continue because the EOF
is required and has not yet been reached.
For rules where you are only trying to parse a portion of the input, ANTLR 4 often works, but not always. The following issue describes a technical problem where ANTLR 4 does not always make the correct decision if the EOF
is omitted.
https://github.com/antlr/antlr4/issues/118
Unfortunately the performance impact of this change is substantial, so until that is resolved there will be edge cases that do not behave as you expect.