
ANTLR 4.5 is giving me the error "mismatched input 'String[]' expecting 'String'", but I don't understand why the '[]' is being included in the token.

I have stripped the grammar down to the bare minimum to show the problem:

grammar Test;    
@header
{
package parser;
}

mainClass : 'class' ID '{' 'void' 'main' '(' 'String' '[' ']' ID ')' '}' ;

ID : [a-zA-Z] [a-zA-z0-9]* ;

WS : [ \t\f\r\n]+ -> channel(HIDDEN);

The input is:

class A
{
    void main(String[] args)
}

If I use 'String []' then the input is successfully parsed.

If I print out the tokens from the parse tree then they all look like what I expect, except for 'String[]' being shown as one ID token and not 3 separate tokens.

I have tried explicitly defining the 'String', '[' and ']' tokens but the result is the same.

I just can't work out what is wrong.

2
The only thing that comes to mind is that your lexer rules are fighting each other. Try converting them into named rules instead of literals, and experiment with the order of the rules. - Divisadero

2 Answers

0
votes

Try defining a lexer rule for each of your tokens. I wonder how you managed to generate a parser from this grammar. ANTLR usually spits out a message that it cannot implicitly generate lexer tokens from literals.

0
votes

Oh the dangers of seeing what you expect to be there.

I changed all tokens to be explicit with the same result.

I then changed the ID lexer rule to use 2 fragments, one for letter and one for digit and it worked!

I changed it back to the original rule and it still worked.

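A very careful check of the version posted in this question showed that the problem was in the uppercase range of the ID lexer rule's second character set. It had A-z but should have been A-Z. In ASCII the range A-z also covers the six characters [ \ ] ^ _ ` that sit between 'Z' and 'a', so '[' and ']' were legal ID characters. Because the lexer prefers the longest match, 'String[]' was consumed as a single ID token instead of 'String', '[' and ']', hence the problem.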
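
For anyone who wants to see the effect outside ANTLR, here is a small Java sketch. It uses java.util.regex as a stand-in for the lexer's character sets, which is my own assumption for illustration. Regex classes and ANTLR sets both treat a range as a span of character codes, but this is not how ANTLR itself is implemented. The class name RangeCheck and the names BUGGY_ID and FIXED_ID are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the grammar or of ANTLR.
class RangeCheck {
    // Same shape as the ID rule in the question, typo included (A-z).
    static final Pattern BUGGY_ID = Pattern.compile("[a-zA-Z][a-zA-z0-9]*");
    // The rule as it was meant to be written (A-Z).
    static final Pattern FIXED_ID = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    // Collects every character strictly between 'Z' and 'a' in ASCII order.
    static String charsBetweenZAndLowerA() {
        StringBuilder sb = new StringBuilder();
        for (char c = 'Z' + 1; c < 'a'; c++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The extra characters that A-z lets through.
        System.out.println(charsBetweenZAndLowerA());
        // With the typo, the whole of "String[]" is a valid identifier.
        System.out.println(BUGGY_ID.matcher("String[]").matches());
        // With A-Z, the brackets are no longer identifier characters.
        System.out.println(FIXED_ID.matcher("String[]").matches());
    }
}
```

Running main prints the six characters [\]^_` on the first line, then true, then false. That matches what the parser reported: with A-z the brackets are swallowed into the identifier, and with A-Z they are left over to become their own tokens.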