ANTLR: required (...)+ loop did not match anything at character

Question

I'm getting:

line n:m required (...)+ loop did not match anything at character u'#'

But the parser finishes with parser.getNumberOfSyntaxErrors() == 0 and produces a correct AST. Further checks confirm that the error message is printed by the lexer, which throws an antlr3.exceptions.EarlyExitException that somehow never reaches the parser.

The lexical rule that should match at that point is:

LOCALVAR
    :
    '#' NAME_CHAR+ 
    ;

And the point of failure in the input reads #I).

Why do the lexing and the parsing succeed? Why is the message printed on valid input?

The lexer probably recovers. Really hard to comment without being able to reproduce it. Can you post an SSCCE? — Bart Kiers
Try calling lexer.getNumberOfSyntaxErrors(). Maybe the lexer recovers as Bart suggests, but continues to count it as an error. — user1201210
This reply from Loring Cryamer on the (gone) lists was helpful: "...but 'X' is eliminated after looking at the next character. That leaves no alternatives, so the DFA reports failure and you get the "No viable alt" exception. That could be fixed--in theory, at least--but would require a lot of additional analysis to refine the DFA. [...] The nightmare case is differentiating keywords and identifiers; append one letter to a keyword, and it can no longer be recognized as keyword or identifier. To get around this, it is necessary to recognize keywords as special cases in identifier rules." — Apalala
IOW, ANTLR lexers do not behave as regexp lexers, but as LL(k) ones, just as the grammar. That makes them more powerful, but also less simple. — Apalala

Apalala · Accepted Answer · 2012-12-13T03:58:09

I solved a similar problem. The original rules were:

DOT : '.' ;

INTEGER
    :
    DIGITS
    ;

FLOAT
    :
    (DIGITS DOT DIGITS)=> DIGITS DOT DIGITS
    ;

When parsing the following phrase:

#J := #X(75.W)

The lexer protests with:

line n:m required (...)+ loop did not match anything at character u'W'

Changing the rules to:

FLOAT
    :
    DIGITS 
    (
       (DOT DIGIT)=>  DOT DIGITS 
    |
        () { $type=INTEGER }
    )
    ;

Fixed the problem.

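The issue is, in part, that ANTLR lexers are not regular-expression (RE) lexers but LL ones: once the lexer has committed to a rule on its lookahead, it does not back up to try a shorter match.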
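
To make the difference concrete, here is a minimal hand-written sketch in Python. It is not ANTLR-generated code, and the function names (digits, committing_number, predicated_number) are hypothetical, chosen only for illustration. The first scanner commits to FLOAT as soon as it sees a dot after the digits, which is roughly what the original rules appear to do on 75.W. The second consumes the dot only if a digit follows it, which mirrors the rewritten FLOAT rule.

```python
def digits(text, pos):
    """Consume one or more digits, like a (...)+ loop that must match once."""
    start = pos
    while pos < len(text) and text[pos].isdigit():
        pos += 1
    if pos == start:
        ch = text[pos] if pos < len(text) else "<EOF>"
        raise ValueError(
            f"required (...)+ loop did not match anything at character {ch!r}"
        )
    return pos


def committing_number(text, pos=0):
    """Commit to FLOAT as soon as a dot follows the digits."""
    end = digits(text, pos)
    if end < len(text) and text[end] == ".":
        # No check of what follows the dot: this raises on '75.W'.
        end = digits(text, end + 1)
        return "FLOAT", text[pos:end]
    return "INTEGER", text[pos:end]


def predicated_number(text, pos=0):
    """Consume the dot only if a digit follows it; otherwise emit INTEGER."""
    end = digits(text, pos)
    if end + 1 < len(text) and text[end] == "." and text[end + 1].isdigit():
        end = digits(text, end + 1)
        return "FLOAT", text[pos:end]
    # The dot is left in the input for a later DOT token.
    return "INTEGER", text[pos:end]
```

On the input 75.W, committing_number raises the "loop did not match" error, while predicated_number returns an INTEGER token for 75 and leaves the dot to be matched separately.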
