Resolving Lexer and Parser ambiguities in ANTLR4

Question

In ANTLR4 I have a lexer rule that matches a word made of any characters except whitespace, line breaks, colons, and commas. It is defined like this:

WORD : ~[ \t\r\n:,]+;

I also have a lexer rule (defined before WORD) that switches to an EVAL mode:

OPENEVAL : '${' -> pushMode(EVAL);

mode EVAL;
CLOSEEVAL : '}' -> popMode;
... (more lexer definitions for EVAL mode) ...

In the parser file I'm trying to match either a grammar rule or a word, so I do the following:

eval : evaluation
     | WORD;

evaluation : OPENEVAL somestuff CLOSEEVAL;

somestuff uses lexer rules defined in the EVAL mode. The problem is that, when the eval rule is applied, the text is identified as a WORD token and not as an evaluation grammar rule. For example, if I enter some text like:

${stuff to be evaluated}

It should go to the evaluation rule, but instead it is identified as a WORD (taking only the "${stuff" part).

I know that there is an ambiguity between evaluation and WORD, but I thought that ANTLR would take the first matching alternative of the parser rule (evaluation in this case).

Sorry if this is confusing. I tried to summarize it as well as I could, and I left out the full parser and lexer contents to avoid a wall of text.

Another option I considered was to define "WORD" as anything but text surrounded by ${ and }. But I don't know how to create such a lexer rule.

How can I solve this and distinguish between evaluation and WORD?

Sam Harwell · Accepted Answer · 2014-01-06T13:30:22

You need to include a predicate that prevents $ from being included in a WORD when it's followed by {.

WORD
  : ( ~[ \t\r\n:,$]
    | '$' {_input.LA(1) != '{'}?
    )+
  ;
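
To see why the original WORD rule wins even though OPENEVAL is defined first, it may help to simulate the lexer's choice outside ANTLR. The ANTLR4 lexer prefers the rule that matches the longest input and only uses rule order to break ties between matches of equal length. The sketch below is a rough Python approximation, not ANTLR itself: the regexes, the ORIGINAL_RULES and FIXED_RULES lists, and the first_token helper are all hypothetical names introduced for illustration, and the negative lookahead stands in for the `_input.LA(1) != '{'` predicate.

```python
import re

# Lexer rules in grammar order, approximated as Python regexes.
# These are illustrative stand-ins, not generated by ANTLR.
ORIGINAL_RULES = [
    ("OPENEVAL", re.compile(r"\$\{")),
    ("WORD", re.compile(r"[^ \t\r\n:,]+")),
]

FIXED_RULES = [
    ("OPENEVAL", re.compile(r"\$\{")),
    # '$' is excluded from the character set and only allowed when the
    # next character is not '{' (the lookahead mimics the predicate).
    ("WORD", re.compile(r"(?:[^ \t\r\n:,$]|\$(?!\{))+")),
]


def first_token(rules, text):
    """Return (rule name, lexeme) for the longest match at the start of text.

    A strictly longer match replaces the current best, so the rule that
    appears earlier in the list wins when two matches have equal length.
    """
    best = None
    for name, pattern in rules:
        m = pattern.match(text)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best


text = "${stuff to be evaluated}"
print(first_token(ORIGINAL_RULES, text))  # ('WORD', '${stuff')
print(first_token(FIXED_RULES, text))     # ('OPENEVAL', '${')
```

With the original rule, WORD matches seven characters while OPENEVAL matches only two, so the order of the rules never comes into play. With the predicate in place, WORD cannot start at "${", which leaves OPENEVAL as the only candidate and lets the lexer enter the EVAL mode.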
