I'm building an AST with ANTLR, based on the separated Java 6 lexer and parser grammars. The lexer is defined in Java6Lex.g and produces the tokens that the parser grammar consumes. The parser handles these without any problem, but as I produce the AST I would like to introduce imaginary tokens. However, ANTLR doesn't seem to like this setup.
The parser grammar pulls in the token vocabulary from the lexer via tokenVocab, which should establish the baseline set of tokens available to the parser:
parser grammar Java6Parse;
options {
    tokenVocab = Java6Lex;
    backtrack = true;
    memoize = true;
    output = AST;
    language = CSharp3;
}
Now, let's say I want to take fieldDeclaration and turn it into a rooted node using a rewrite rule. I assumed (clearly wrongly) that I could introduce the imaginary token directly in the parser grammar, as follows:
fieldDeclaration
    : modifiers type variableDeclarator (COMMA variableDeclarator)* SEMI
      -> ^(FIELD modifiers type variableDeclarator+)
    ;
However, this results in the following error:
reference to undefined token in rewrite rule: FIELD
Fair enough, I never defined it. So I tried defining it in a tokens section in the parser grammar, again assuming (wrongly, it seems) that tokenVocab would provide the baseline and that I could add to it:
tokens { FIELD; }
No luck. Even defining a tokens block results in an EarlyExitException and an error indicating that Java6Parse.g has no rules. I figured the parser grammar simply doesn't like tokens being defined in the parser, so I defined FIELD in the lexer instead. That failed as well. Then I defined every token in both the lexer and the parser, and that failed too.
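For reference, here is a minimal sketch of the layout in question. It assumes the tokens section belongs immediately after the options block and before any @header/@members actions or rules, which is my reading of the ANTLR 3 grammar structure and not something I have confirmed for this setup. The remaining rules (modifiers, type, variableDeclarator and so on) are omitted, so this is a fragment rather than a complete grammar:

```
parser grammar Java6Parse;

options {
    tokenVocab = Java6Lex;   // real tokens come from the lexer's vocabulary
    backtrack = true;
    memoize = true;
    output = AST;
    language = CSharp3;
}

// Imaginary token: no lexer rule produces it; it only labels an AST node.
// Assumed placement: after options, before any actions or rules.
tokens {
    FIELD;
}

fieldDeclaration
    : modifiers type variableDeclarator (COMMA variableDeclarator)* SEMI
      -> ^(FIELD modifiers type variableDeclarator+)
    ;
```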
So, here's what I need to know: is there a way to define an imaginary token when the lexer and parser are separated, and if so, how? If not, is the only option to combine the lexer and parser back into a single grammar file?