I found out how. It might not be the best approach but it certainly seems to be working.
- ANTLR parsers receive an `ITokenStream` parameter.
- ANTLR lexers are themselves `ITokenSource`s.
- `ITokenSource` is a significantly simpler interface than `ITokenStream` (a rough sketch of it follows this list).
- The simplest way to convert an `ITokenSource` into an `ITokenStream` is to use a `CommonTokenStream`, which takes an `ITokenSource` as a constructor parameter.
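To give an idea of how small the contract is, this is roughly what `ITokenSource` looks like in the ANTLR 3 C# runtime (a sketch from memory, not the authoritative declaration; the exact members can vary between runtime versions):

// Sketch of the ITokenSource contract (Antlr.Runtime). A token source only
// has to hand out one token at a time, whereas ITokenStream must also
// support lookahead, marking and rewinding.
public interface ITokenSource
{
    IToken NextToken();        // return the next token; an EOF token when input ends
    string SourceName { get; } // present in later 3.x runtimes
}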
So now we only need to do two things:
- Adjust the grammar to be parser-only
- Implement `ITokenSource`

Adjusting the grammar is very simple: remove all lexer rules and declare the grammar as a `parser grammar`. A simple example is posted here for convenience:
parser grammar mygrammar;

options
{
    language=CSharp2;
}

@parser::namespace { MyNamespace }

document: ( WORD   { Console.WriteLine($WORD.text); }
          | NUMBER { Console.WriteLine($NUMBER.text); }
          )*;
Note that this grammar will generate a class named `mygrammar` instead of `mygrammarParser`.
So now we want to implement a "fake" lexer.
I personally used the following pseudo-code:
TokenQueue q = new TokenQueue();
//Do normal lexer stuff and output to q
CommonTokenStream cts = new CommonTokenStream(q);
mygrammar g = new mygrammar(cts);
g.document();
Finally, we need to define `TokenQueue`. `TokenQueue` is not strictly necessary, but I used it for convenience. It should have methods to receive the lexer's tokens and methods to hand out ANTLR tokens, so if you are not using ANTLR's native tokens you have to implement a convert-to-ANTLR-token method. Also, `TokenQueue` must implement `ITokenSource`.
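For reference, here is a minimal sketch of such a `TokenQueue`. This is a reconstruction rather than my exact code: the `Enqueue` method name is my own choice, and you should check your runtime version for the exact `ITokenSource` members and the proper EOF constant.

using System.Collections.Generic;
using Antlr.Runtime;

// Queue-backed ITokenSource: the custom lexer pushes IToken objects in with
// Enqueue(), and the parser (via CommonTokenStream) pulls them out through
// NextToken().
public class TokenQueue : ITokenSource
{
    private readonly Queue<IToken> tokens = new Queue<IToken>();

    // Called by the custom lexer for every token it produces.
    public void Enqueue(IToken token)
    {
        tokens.Enqueue(token);
    }

    // ITokenSource: hand out tokens one at a time and signal end of input
    // with an EOF token (token type -1 in ANTLR 3).
    public IToken NextToken()
    {
        if (tokens.Count > 0)
            return tokens.Dequeue();
        return new CommonToken(-1); // EOF; your runtime may expose a named constant for this
    }

    // Declared by ITokenSource in later 3.x runtimes; harmless otherwise.
    public string SourceName
    {
        get { return "TokenQueue"; }
    }
}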
Be aware that it is very important to set the token fields correctly. Initially I had some problems because I was miscalculating `CharPositionInLine`; if these fields are set incorrectly, the parser may fail. Also note that the normal (non-hidden) channel is 0.
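To make the convert-to-ANTLR-token step concrete, here is a hedged sketch. `MyToken` and its fields are hypothetical stand-ins for whatever your own lexer produces, and the property names (`Line`, `CharPositionInLine`, `Channel`) are the ones I remember from the C# runtime, so adjust them if your version differs. The token type has to match the numbers ANTLR assigns to WORD and NUMBER for the parser grammar (listed in the generated `.tokens` file and, assuming the C# target mirrors the Java one, exposed as constants on the generated class).

using Antlr.Runtime;

// Hypothetical token type produced by the custom lexer (not part of ANTLR).
public class MyToken
{
    public string Text;
    public int Line;     // 1-based line number
    public int Column;   // 0-based character position within the line
    public bool IsNumber;
}

public static class TokenConverter
{
    // Convert a custom lexer token into an ANTLR CommonToken. Line,
    // CharPositionInLine and Channel must be set correctly or the parser
    // may fail or report confusing errors.
    public static IToken ToAntlrToken(MyToken t)
    {
        int type = t.IsNumber ? mygrammar.NUMBER : mygrammar.WORD;
        CommonToken token = new CommonToken(type, t.Text);
        token.Line = t.Line;
        token.CharPositionInLine = t.Column;
        token.Channel = 0; // default (non-hidden) channel
        return token;
    }
}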
This seems to be working for me so far. I hope others find it useful as well.
I'm open to feedback. In particular, if you find a better way to solve this problem, feel free to post a separate reply.