I'm looking for a way to make certain specific tokens case-insensitive in my ANTLR parser. What I have already tried:

  1. Converting my input to all lowercase. This didn't work because some parts of my grammar require case sensitivity.

  2. Listing both the uppercase and lowercase versions of the tokens. This didn't work either, because my lexer file became too large (breaking the 64k limit of ANTLR).

What I hope exists is some regex trick or maybe an ANTLR flag that tells the parser to treat certain tokens differently.


An example:

SENSITIVETOKEN
:
    'footoken' 
;

INSENSITIVETOKEN
:
    'bootoken'  (some magic here)
;

The lexer should recognize "BOOTOKEN" as an INSENSITIVETOKEN,
but not "FOOTOKEN" as a SENSITIVETOKEN.


Thanks for your help! ^^

1 Answer

One possible solution could be to declare one-letter fragments and construct tokens based on those fragments.

Example:

INSENSITIVETOKEN
:
    B O O T O K E N
;

fragment B: ('B'|'b');
fragment O: ('O'|'o');
fragment T: ('T'|'t');
fragment K: ('K'|'k');
fragment E: ('E'|'e');
fragment N: ('N'|'n');

Or, if there aren't many case-insensitive tokens, simply inline the alternatives:

INSENSITIVETOKEN
:
    ('B'|'b')('O'|'o')('O'|'o')('T'|'t')('O'|'o')('K'|'k')('E'|'e')('N'|'n')
;