Switch CommonTokenStream to ignore or enable Whitespace

Question

My original grammar uses the skip command to ignore whitespace during parsing.

WS      :   [ \t]+ ->  skip ;

However, for refactoring methods I need to send whitespace tokens to a hidden channel so that I can use the TokenStreamRewriter, following this recipe: ANTLR4: TokenStreamRewriter output doesn't have proper format (removes whitespaces)

WS      :   [ \t]+ ->  channel(HIDDEN);

The problem now is that the parser sees whitespace as tokens, which I want to avoid during regular parsing.

Is it possible to switch between two different implementations of the same rule, depending on whether I am doing regular parsing or parsing for refactoring (with the same grammar)?

Do I need semantic predicates for this? Or is there a method available in the CommonTokenStream to skip or enable whitespace?

Mike Cargal · Accepted Answer · 2016-01-19T21:12:46

I'm not really sure what is causing your problem. Your expected behavior is correct.

WS      :   [ \t]+ ->  channel(HIDDEN);

will move those tokens to a channel that is not processed by the parser. You do not need semantic predicates, or any special calls on CommonTokenStream to make this happen.

This is what I do in my grammar and WS is not seen by the parser (I have a slightly different WS rule, but nothing that should make a difference).

The lexer (aka tokenizer) runs independently of the parser (and before the parser), so the parser can't do anything to affect how the lexer does its job (for example, which channel a token is placed on).
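To make that separation concrete, here is a toy model in plain Java. It is only a sketch: the Tok and ChannelDemo classes are hypothetical names made up for illustration and are not the ANTLR runtime types, and the "lexer" simply splits on runs of spaces and tabs. The point is that the channel is decided while lexing, the parser's view only contains default-channel tokens, and the full token list still contains everything a rewriter needs to reproduce the original text.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical classes, NOT the ANTLR runtime API.
final class Tok {
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 1;

    final String text;
    final int channel;

    Tok(String text, int channel) {
        this.text = text;
        this.channel = channel;
    }
}

final class ChannelDemo {
    static boolean isWs(char c) {
        return c == ' ' || c == '\t';
    }

    // "Lexer": splits the input into runs of whitespace and non-whitespace.
    // The channel is assigned here, before any parsing happens.
    static List<Tok> lex(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            boolean ws = isWs(input.charAt(i));
            int start = i;
            while (i < input.length() && isWs(input.charAt(i)) == ws) {
                i++;
            }
            int channel = ws ? Tok.HIDDEN_CHANNEL : Tok.DEFAULT_CHANNEL;
            out.add(new Tok(input.substring(start, i), channel));
        }
        return out;
    }

    // What the "parser" gets to see: default-channel tokens only.
    static List<String> parserView(List<Tok> tokens) {
        List<String> visible = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.channel == Tok.DEFAULT_CHANNEL) {
                visible.add(t.text);
            }
        }
        return visible;
    }

    // What a rewriter can reproduce: every token, hidden ones included.
    static String fullText(List<Tok> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : tokens) {
            sb.append(t.text);
        }
        return sb.toString();
    }
}
```

Calling ChannelDemo.lex on a string such as "a  =\t1" and passing the result to parserView and fullText shows both views of the same token list: the first has no whitespace entries, the second gives back the input unchanged.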

You may also want to take a look at the following method on your CommonTokenStream (it is inherited from BufferedTokenStream):

public List<Token> getTokens(int start, int stop, int ttype)

With that method you can pull out a list of the tokens of one type (for example, your comment or whitespace tokens) between the start and stop token indices, by supplying that token type as the third parameter.
