I have a grammar in ANTLR4 around which I am writing an application. A snippet of the pertinent grammar is shown below:

grammar SomeGrammar;
// ... a bunch of other parse rules
operand
   : id | literal ;
id
   : ID ;
literal
   : LITERAL ;
// A bunch of other lexer rules
LITERAL       : NUMBER | BOOLEAN | STRING;
NUMBER        : INTEGER | FLOAT ;
INTEGER       : [0-9]+ ;
FLOAT         : INTEGER '.' INTEGER | '.' INTEGER ;
BOOLEAN       : 'TRUE' | 'FALSE' ;
ID            : [A-Za-z]+[A-Za-z0-9_]* ;
STRING        : '"' .*? '"' ;

I generate the antlr4 JavaScript Lexer and Parser like so:

$ antlr4 -o . -Dlanguage=JavaScript -listener -visitor SomeGrammar.g4

and then I override the exitLiteral() prototype method to check whether an operand is a literal. The issue is that if I pass

a

it is forcibly parsed as a literal, and an error is reported (shown below with grun):

$ grun YARL literal -gui -tree
a
line 1:0 mismatched input 'a' expecting LITERAL
(literal a)

I get the same error when I use the JavaScript parser, whose listener I overrode like so:

SomeGrammarLiteralPrinter.prototype.exitLiteral = function (ctx) {
    debug("Literal is " + ctx.getText()); // Literal is a
};

I would like to catch the error so that I can decide that it is an ID, and not a LITERAL. How do I do that?

Any help is appreciated.

Why not just use operand? - sepp2k
@sepp2k: I specifically need to know if it is a literal or operand, but I get your point. I think I could check with the same lexing rules in my application and use the operand like you said, but I was wondering if there is an antlr4 parser way. - Sonny
I don't have much experience with ANTLR4, but you'll know that based on which listener/visitor method will be called, no? - sepp2k
@sepp2k, I think you are right about that too. I need to go back to the drawing board to understand how I am using the listeners. Thanks again! - Sonny

1 Answer

A better solution is to adjust the grammar so that it accurately describes the intended syntax from the start:

startRule : ruleA ruleB EOF ;
ruleA     : something operand anotherthing ;
ruleB     : id assign literal  ;

operand   : ID | LITERAL ;
id        : ID ;
literal   : LITERAL ;

The parser evaluates the parser rules top-down, starting from startRule. That is, it matches the elements of startRule in order, descending only into the sub-rules that each element names. Consequently, while matching ruleA the parser descends into operand and never considers the id and literal rules.

In this limited example, then, the seemingly overlapping definitions of the operand, id, and literal rules do not conflict.

Update

The generated OperandContext class will contain ID() and LITERAL() methods, each returning a TerminalNode. The one that does not return null represents the token that was actually matched in that specific context. Look at the generated code to confirm.
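
As a minimal sketch of that check in JavaScript, assuming the generated accessors behave as described (each returns null when its token was not matched in the context). The names classifyOperand and fakeOperand are hypothetical; fakeOperand only stands in for a real OperandContext so that the snippet runs without the ANTLR runtime.

```javascript
// Hypothetical helper: decide which alternative of
//   operand : ID | LITERAL ;
// was matched, by asking the context for each terminal in turn.
// Assumes ctx.ID() and ctx.LITERAL() return a terminal node or null.
function classifyOperand(ctx) {
    var idNode = ctx.ID();
    if (idNode != null) {
        return { kind: "id", text: idNode.getText() };
    }
    var literalNode = ctx.LITERAL();
    if (literalNode != null) {
        return { kind: "literal", text: literalNode.getText() };
    }
    // Neither terminal is present, e.g. after a syntax error.
    return { kind: "unknown", text: ctx.getText() };
}

// Stand-in for a generated OperandContext (an assumption for illustration):
// exactly one of ID() / LITERAL() yields a node, the other yields null.
function fakeOperand(tokenName, text) {
    var node = { getText: function () { return text; } };
    return {
        ID: function () { return tokenName === "ID" ? node : null; },
        LITERAL: function () { return tokenName === "LITERAL" ? node : null; },
        getText: function () { return text; }
    };
}

console.log(classifyOperand(fakeOperand("ID", "a")));
console.log(classifyOperand(fakeOperand("LITERAL", "42")));
```

In a real listener, the same helper could be called from an exitOperand override with the ctx argument that the tree walker passes in.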