I am trying to port an existing grammar, developed for an unknown tool, to ANTLR. The grammar has a use case with two tokens, such as TEXT and TEXT_WITHOUT_A. Some rules in the grammar should allow only text without the letter a, while the rest can accept any text.
My initial attempts produced the following grammar. The problem is that ANTLR matches the more specific rule (txtwa) even though txt is a superset of it. If I enter something like 'sometextwth', which does not contain an a, ANTLR does not follow the rule for text (txt). The expected input is txt, and the provided input matches it, but ANTLR decides that the input matches txtwa and, even though txtwa is not expected at that point in the grammar, does not fall back to txt.
/*------------------------------------------------------------------
* PARSER RULES
*------------------------------------------------------------------*/
expr : ( txt)* ;
txt : TEXT ;
txtwa : LETTERS_MINUS_A;
term : factor ( (MULT | DIV) factor)*;
factor : NUMBER;
/*------------------------------------------------------------------
* LEXER RULES
*------------------------------------------------------------------*/
NUMBER : (DIGIT)+ ;
WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C')+ {$channel = HIDDEN;} ;
fragment LETTER_MINUS_A : ('b'..'z' | 'B'..'Z');
fragment LETTER : ('a'..'z' | 'A'..'Z');
fragment DIGIT : '0'..'9' ;
LETTERS_MINUS_A
: LETTER_MINUS_A (LETTER_MINUS_A)*;
TEXT : LETTER (LETTER)* ;
I'd like to use txt freely without having to write (txt | txtwa) everywhere, which does work, by the way. What am I missing here?
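For reference, the behaviour described above is consistent with the lexer tokenizing the input on its own, without knowing which token the parser expects: an input with no a satisfies both lexer rules, and LETTERS_MINUS_A is the one that gets emitted. One possible workaround, given only as a sketch and not verified against the full grammar, is to fold the alternative into the txt parser rule itself, so that callers can keep writing plain txt:

```antlr
// Sketch only: assumes the lexer rules from the grammar above are unchanged.
// The lexer emits LETTERS_MINUS_A for a run of letters with no 'a'/'A',
// so txt accepts that token type in addition to TEXT.
txt   : TEXT | LETTERS_MINUS_A ;   // any text, with or without 'a'
txtwa : LETTERS_MINUS_A ;          // only text that contains no 'a'
```

With this change, expr : ( txt )* ; would accept 'sometextwth' without every call site having to spell out (txt | txtwa).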