36
votes

I'm creating my first grammar with ANTLR and ANTLRWorks 2. I have mostly finished the grammar itself (it recognizes the code written in the described language and builds correct parse trees), but I haven't started anything beyond that.

What worries me is that every first occurrence of a token in a parser rule is underlined with a yellow squiggle saying "Implicit token definition in parser rule".

For example, in this rule, the 'var' has that squiggle:

variableDeclaration: 'var' IDENTIFIER ('=' expression)?;

The odd thing is that ANTLR itself doesn't seem to mind these rules (when I run the test rig, I don't see any of these warnings in the parser generator output, just something about an incorrect Java version being installed on my machine), so it's only ANTLRWorks complaining.

Is it something to worry about, or should I ignore these warnings? Should I declare all the tokens explicitly in lexer rules? Most examples in the official bible, The Definitive ANTLR Reference, seem to be written exactly the way I write my code.

2 Answers

30
votes

I highly recommend correcting all instances of this warning in code of any importance.

This warning was created (by me actually) to alert you to situations like the following:

shiftExpr : ID (('<<' | '>>') ID)?;

Since ANTLR 4 encourages writing action code in separate files in the target language instead of embedding it directly in the grammar, it's important to be able to distinguish between << and >>. If tokens are not explicitly created for these operators, they will be assigned arbitrary types and no named constants will be available for referencing them.
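As a rough sketch of the fix, the rule above could be rewritten with explicitly declared operator tokens. The grammar name and the token names LSHIFT and RSHIFT are my own choices for illustration, and the ID and WS rules are assumed shapes, not taken from any real grammar:

```antlr
grammar Shift; // hypothetical grammar name

shiftExpr : ID ((LSHIFT | RSHIFT) ID)? ;

// Explicit token definitions (names are assumptions)
LSHIFT : '<<' ;
RSHIFT : '>>' ;
ID     : [a-zA-Z_] [a-zA-Z_0-9]* ;
WS     : [ \t\r\n]+ -> skip ;
```

With these rules in place, the generated parser should expose constants named LSHIFT and RSHIFT, so code in a separate listener or visitor can compare a token's type against them. As far as I know, once a lexer rule such as LSHIFT matches exactly the literal '<<', you can also keep writing '<<' in the parser rule and it will refer to that same token, so the warning goes away either way.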

This warning also helps avoid the following problematic situations:

  • A parser rule contains a misspelled token reference. Without the warning, this could lead to silent creation of an additional token that may never be matched.
  • A parser rule contains an unintentional token reference, such as the following:

    number : zero | INTEGER;
    zero   : '0'; // <-- this implicit definition causes 0 to get its own token
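To make the second case concrete: the implicit '0' token competes with INTEGER in the lexer, and because implicit literals are, to my understanding, defined ahead of the explicit lexer rules, the input 0 would come out as the '0' token and never as INTEGER. A minimal sketch of one way to avoid that, assuming a lone 0 does not need its own token (the grammar name and the INTEGER and WS definitions are assumptions):

```antlr
grammar Numbers; // hypothetical grammar name

// '0' is no longer a separate token; a lone 0 is just an INTEGER
number  : INTEGER ;

INTEGER : [0-9]+ ;
WS      : [ \t\r\n]+ -> skip ;
```

If zero really does have to be told apart at the token level, declaring it explicitly (for example ZERO : '0' ; placed before INTEGER) at least makes that decision visible in the lexer rules instead of hiding it inside a parser rule.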
    
2
votes

If you're writing a lexer grammar that won't be shared across multiple parser grammars, then you can ignore this warning shown by ANTLRWorks 2.
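For comparison, if you do decide to get rid of the warning, the rule from the question might look roughly like this. The token names VAR and ASSIGN and the shape of IDENTIFIER are assumptions, since the question doesn't show its lexer rules:

```antlr
// Fragment only: 'expression' is assumed to be defined elsewhere in the grammar.
variableDeclaration : VAR IDENTIFIER (ASSIGN expression)? ;

// Token names VAR and ASSIGN are assumptions.
VAR        : 'var' ;
ASSIGN     : '=' ;
// Assumed shape of IDENTIFIER; it must come after VAR so that 'var' lexes as a keyword.
IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]* ;
```

The ordering matters because when two lexer rules match the same text with the same length, ANTLR picks the rule that appears first in the grammar.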