When using ANTLR4, I keep coming back to the same problem: how to implement algorithmic rule validation in the parser.
For example, I need the parser to validate the "year" part of a date written "month day year" before matching the rule. I've learned I can do this using a predicate as follows:
date :
    {isYear(_input.LT(3).getText())}?
    month day=INTEGER year=INTEGER { ... }
    ;
But this solution isn't general: LT(3) is the year token only if the rule month always matches exactly one token.
I thought I'd discovered a way around this problem by changing the rule to this:
date : month day=INTEGER yearInt { ... } ;
yearInt returns [int i]
    : {isYear(_input.LT(1).getText())}?
      yr=INTEGER { $i = $yr.int; }
    ;
Unfortunately, this grammar accepts "July 11 6" as a date, even though isYear("6") returns false. When I trace through the ANTLR-generated code for yearInt() in XXParser.java, I see it execute
throw new FailedPredicateException(this, "isYear(_input.LT(1).getText())");
but the parser then carries on and accepts yearInt() anyway.
Is this an ANTLR bug, or my bug? Is there a "proper" way to write a grammar that needs to validate the parts of a rule?
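For concreteness, a helper like isYear might look like the sketch below. The class name DateChecks and the "exactly four digits" rule are assumptions made for illustration, not the actual implementation behind the grammar above. In a real grammar the method would more likely sit in an @parser::members block so the predicate can call it unqualified.

```java
// Hypothetical helper, for illustration only: the real isYear is not shown above.
final class DateChecks {
    private DateChecks() {}

    // Assumed rule: a year is exactly four ASCII digits, e.g. "2014".
    static boolean isYear(String text) {
        if (text == null || text.length() != 4) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

Under that assumed rule, isYear("6") is false and isYear("2014") is true, which matches the behavior described for the "July 11 6" input.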