In using ANTLR4, I keep coming back to the same problem: how to implement algorithmic rule validation in the parser.

For example, I need the parser to validate the "year" part of a date written "month day year" before matching the rule. I've learned I can do this using a predicate as follows:

date :
    {isYear(_input.LT(3).getText())}?
        month  day=INTEGER  year=INTEGER     { ... }
    ;

But this solution isn't general, since it depends on the rule month always being one token long.

I thought I'd discovered a way around this problem by changing the rule to this:

date :  month  day=INTEGER  yearInt     { ... } ;

yearInt returns [int i]
     :   {isYear(_input.LT(1).getText())}?
             yr=INTEGER                 { $i = $yr.int; }
     ;

Unfortunately, this grammar passes "July 11 6" as a date, even though isYear("6") fails. When I trace through the ANTLR-generated code in XXParser.java for yearInt(), I see it call

throw new FailedPredicateException(this, "isYear(_input.LT(1).getText())");

but the code then carries on and accepts yearInt() anyway.

Is this an ANTLR bug, or my bug? Is there a "proper" way to write a grammar that needs to validate the parts of a rule?

1 Answer

Try

date :
    month  day=INTEGER  year=INTEGER {isYear($year.text)}?<fail='A sensible error msg'> { ... }
;

or

date :
    month  day=INTEGER  year=INTEGER {if ( ! isYear($year.text) )
                                       notifyErrorListeners("A sensible error msg");
                                     }
    { ... }
;

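The two behave differently after a bad year, so one of them will produce more sensible error messages for you. notifyErrorListeners() reports the error but lets the parse "succeed" as far as the ongoing parse is concerned. A failing {isYear(...)}? predicate instead throws FailedPredicateException, which the generated rule method catches, reports, and recovers from like any other syntax error. That is why you see the exception thrown and the parse carrying on anyway: it is normal error recovery rather than an ANTLR bug.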

I confess that I haven't actually tried this code. Since $year is a Token and your isYear() takes a String, the check needs $year.text rather than $year. I'm also not sure whether the fail option and notifyErrorListeners() are valid in the C# version as well as the Java version.
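
For reference, here is a minimal sketch of what an isYear() helper might look like on the Java side. The question never shows its implementation, so the class name YearCheck and the "exactly four digits, no leading zero" rule are assumptions made purely for illustration. It takes a String, matching the question's isYear(_input.LT(1).getText()) calls.

```java
/**
 * Hypothetical helper: the question does not show isYear(), so both the
 * class name and the validation rule below are assumptions.
 */
final class YearCheck {
    private YearCheck() {}

    // Assumed rule: exactly four ASCII digits with no leading zero,
    // i.e. a value in the range 1000..9999.
    static boolean isYear(String text) {
        if (text == null || text.length() != 4) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return text.charAt(0) != '0';
    }
}
```

With this rule, isYear("6") is false, so "July 11 6" would trigger whichever error path the grammar uses. In a real grammar the method could equally live in the @members block so the actions can call it unqualified.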

George