Access Channels in ANTLR 4 and Parse them separately

Question

I have put my comments on a separate channel in ANTLR 4. In my case it is channel 2.

This is my lexer grammar.

COMMENT: '/*' .*? '*/' -> channel(2) 
       ;

I want to access channel 2 and parse it to accumulate the comments. So I included the following in the parser grammar:

comment
:COMMENT
;

In the program

        string s = " parsing string";
        AntlrInputStream input = new AntlrInputStream(s);
        CSSLexer lexer = new CSSLexer(input); 
       
        CommonTokenStream tokens = new CommonTokenStream(lexer,2);

Then I want to do the parsing on the tokens

var xr = parser.comment().GetRuleContexts<CommentContext>();

because I want to get the information from the CommentContext object such as Start.Column etc.

EDIT:

This is the improved question

To be more specific, I want to get all the tokens on channel 2 and parse them using the comment grammar to collect all the comments in a list (IReadOnly<CommentContext>), so that I can iterate through them and access information such as start line, start column, end line, end column, and the token text.

CommonTokenStream tokens = new CommonTokenStream(lexer,2);

This is not giving me the tokens in channel 2. Another thing I discovered is that I can only access the tokens by calling GetTokens() after they have been passed as an argument to the parser constructor: XParser parser = new XParser(tokens);

In those tokens I can see that the comments are identified as tokens and are on channel 2. Even though the CommonTokenStream specifies the channel number as above, it contains all the tokens.

Why can't I access the tokens until the parser object has been created from them?
I want to get a CommonTokenStream on channel 2 and pass it to the XParser constructor so that these tokens are parsed with my comment grammar. What is the best way of doing this with the ANTLR 4 API?

Sam Harwell · Accepted Answer · 2013-09-05T12:26:13

CommonTokenStream internally tracks all tokens from any channel. The only thing you won't see when you call getTokens() is lexer rules where a -> skip action was executed (no token is even created for those rules).
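To make that behaviour concrete, here is a small stand-alone Java model of a channel-filtered stream. It is only a sketch and not ANTLR's implementation: the names Tok, ChannelView, lt1 and textsOn are made up for illustration, and end of input is modelled as null instead of an EOF token. The buffer keeps every token, while lookahead and consume only ever stop on tokens of the requested channel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token: only text and channel are modelled.
final class Tok {
    final String text;
    final int channel;
    Tok(String text, int channel) { this.text = text; this.channel = channel; }
}

// Toy model of a channel-filtered token stream (not an ANTLR class).
final class ChannelView {
    private final List<Tok> buffer; // every token, regardless of channel
    private final int channel;      // the channel this view exposes
    private int p;                  // index of the current on-channel token

    ChannelView(List<Tok> all, int channel) {
        this.buffer = new ArrayList<>(all);
        this.channel = channel;
        this.p = nextOnChannel(0);
    }

    // Skip forward from i to the first token on the chosen channel.
    private int nextOnChannel(int i) {
        while (i < buffer.size() && buffer.get(i).channel != channel) i++;
        return i;
    }

    // Current on-channel token, or null at end of input (assumption of this model).
    Tok lt1() { return p < buffer.size() ? buffer.get(p) : null; }

    // Advance to the next on-channel token.
    void consume() { p = nextOnChannel(p + 1); }

    // The buffer still holds tokens from all channels.
    int bufferedCount() { return buffer.size(); }

    // Collect the text of every token visible on the given channel.
    static List<String> textsOn(List<Tok> all, int channel) {
        ChannelView view = new ChannelView(all, channel);
        List<String> out = new ArrayList<>();
        while (view.lt1() != null) {
            out.add(view.lt1().text);
            view.consume();
        }
        return out;
    }
}

class ChannelDemo {
    public static void main(String[] args) {
        List<Tok> all = Arrays.asList(
            new Tok("/* a */", 2), new Tok("h1", 0), new Tok("{", 0),
            new Tok("/* b */", 2), new Tok("}", 0));
        System.out.println(ChannelView.textsOn(all, 2)); // the two comment tokens
        System.out.println(new ChannelView(all, 2).bufferedCount()); // all five tokens
    }
}
```

In this model, listing the whole buffer shows tokens from every channel, which matches what the question observed with GetTokens(); only the lookahead/consume path is filtered.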

You can look at the tokens on channel 2 by using the TokenStream.LT and IntStream.consume methods.

Java example:

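// tokenSource is your lexer; 2 is the channel to read from
CommonTokenStream cts = new CommonTokenStream(tokenSource, 2);
List<Token> tokens = new ArrayList<Token>();
// LA(1) is the type of the next on-channel token; stop at end of input
while (cts.LA(1) != Token.EOF) {
    tokens.add(cts.LT(1));
    cts.consume();
}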

C# example:

CommonTokenStream cts = new CommonTokenStream(tokenSource, 2);
IList<IToken> tokens = new List<IToken>();
while (cts.La(1) != Eof)
{
    tokens.Add(cts.Lt(1));
    cts.Consume();
}
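
The question also asks for the end line and end column of each comment. A token reports where it starts (in the Java runtime, getLine() is 1-based and getCharPositionInLine() is 0-based; the C# target exposes corresponding properties), so for a multi-line comment the end position can be derived from the token text. A minimal sketch, assuming '\n' line breaks; TokenSpan and endPosition are hypothetical names, not part of the ANTLR API:

```java
// Hypothetical helper: derives where a token ends from where it starts.
final class TokenSpan {
    // Returns {endLine, endColumn} of the last character of text,
    // given the position of its first character.
    // Assumes 1-based lines, 0-based columns, and '\n' as the line break.
    static int[] endPosition(int startLine, int startColumn, String text) {
        int line = startLine;
        int column = startColumn;
        // Walk every character except the last one; each step moves the
        // position forward to the character that follows.
        for (int i = 0; i < text.length() - 1; i++) {
            if (text.charAt(i) == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
        }
        return new int[] { line, column };
    }
}

class TokenSpanDemo {
    public static void main(String[] args) {
        // In real code the three arguments would come from the comment token.
        int[] end = TokenSpan.endPosition(3, 2, "/*\nx */");
        System.out.println(end[0] + ":" + end[1]); // prints 4:3
    }
}
```

With the loops above, each collected token would supply its own line, column and text as the three arguments.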
