Some keywords (string constants) in my grammar contain capital letters, e.g.
PREV_VALUE : 'PreviousValue';
This causes strange lexing behavior: other input that contains the same capital letters ('P', 'V') is tokenized incorrectly.
Here's a simplified version of the lexer grammar:
lexer grammar ExpressionLexer;
COMMA : ',';
LPAREN : '(';
RPAREN : ')';
LBRACK : '[';
RBRACK : ']';
PLUS : '+';
MINUS : '-';
MULT : '*';
DIV : '/';
PREV_VALUE : 'PreviousValue';
fragment DIGIT : ('0'..'9');
fragment LETTER : ('a'..'z'|'A'..'Z'|'_');
fragment TAB : ('\t') ;
fragment NEWLINE : ('\r'|'\n') ;
fragment SPACE : (' ') ;
When I try to parse an expression like this:
var expression = "P"; // Capital 'P', which is also the first letter of the keyword 'PreviousValue'
var stringReader = new StringReader(expression);
var input = new ANTLRReaderStream(stringReader);
var expressionLexer = new ExpressionLexer(input);
var tokens = new CommonTokenStream(expressionLexer);
the tokens._tokens collection contains only one value:
[0] = {[@0,1:1='<EOF>',<-1>,1:1]}
This is incorrect: the 'P' never appears as a token, only EOF does.
If I change the expression to "p" (a lowercase letter), the tokens._tokens collection contains two values:
[0] = {[@0,0:0='p',<0>,1:0]}
[1] = {[@1,1:1='<EOF>',<-1>,1:1]}
This is what I expect.
When the rule PREV_VALUE : 'PreviousValue'; is removed from the grammar, both expressions are tokenized correctly.
Is it possible to use mixed-case keywords? Is there an example of such keywords in an ANTLR grammar?
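For reference, here is a minimal sketch of the keyword-plus-identifier layout commonly seen in ANTLR grammars. It assumes a non-fragment identifier rule is acceptable for this lexer. The rule name ID is hypothetical and not part of my grammar above, where DIGIT and LETTER are only fragments and so never produce tokens on their own.

```antlr
// Sketch only: ID is a hypothetical rule name, not part of the original grammar.
PREV_VALUE : 'PreviousValue';            // keyword, listed before the identifier rule
ID         : LETTER (LETTER | DIGIT)* ;  // non-fragment rule that can match a lone "P"

fragment DIGIT  : '0'..'9' ;
fragment LETTER : 'a'..'z' | 'A'..'Z' | '_' ;
```

With a layout like this, the intent is that the exact text 'PreviousValue' becomes PREV_VALUE, while any other letter sequence, including "P", falls through to ID.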