I am using ANTLR 4 with the Java 7 grammar to modify a Java source file. More specifically, I am using the TokenStreamRewriter class to replace some tokens. The following sample shows how the tokens are modified:

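public class TestListener extends JavaBaseListener {
    private final TokenStreamRewriter rewriter;

    public TestListener(TokenStream tokenStream) {
        rewriter = new TokenStreamRewriter(tokenStream);
    }

    // Inside an enter/exit callback, where ctx is the rule context to rewrite:
    //     rewriter.replace(ctx.getStart(), ctx.getStop(), "someText");
}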

When I print the altered source code, all whitespace (spaces, tabs and line breaks) is gone, and the new source file looks like this:

importjava.util.ArrayList;publicclassMain{publicstaticvoidmain(String[]args{MyTimertimer=newMyTimer();}}

I am using extractor.getText() to print it back.

Is this a problem with the grammar, or should I be using a different method of the TokenStreamRewriter class?

1 Answer

The issue is that the lexer discards whitespace instead of passing it on, so those tokens never reach the token stream and the rewriter has nothing to emit for them. This is caused by the skip lexer command in the grammar:

WS : [ \t\r\n\u000C]+ -> skip ;

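You have to change every rule that uses -> skip to -> channel(HIDDEN) instead, for example WS : [ \t\r\n\u000C]+ -> channel(HIDDEN) ; and then regenerate the lexer. This keeps those tokens in the token stream on a separate channel, so the rewriter can still emit them, while the parser ignores them.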
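
To see why the channel matters, here is a toy model of the difference in plain Java. It is only a sketch and does not use the ANTLR API: Tok, lex and render are hypothetical names made up for the illustration, and the "lexer" just splits the input into whitespace and non-whitespace runs. The assumption it mirrors is that the rewriter prints every token present in the stream, whatever its channel.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT the ANTLR API. Tok, lex and render are hypothetical names.
final class HiddenChannelDemo {
    static final int DEFAULT = 0;
    static final int HIDDEN = 1;

    static final class Tok {
        final String text;
        final int channel;

        Tok(String text, int channel) {
            this.text = text;
            this.channel = channel;
        }
    }

    // Splits the input into whitespace runs and non-whitespace runs.
    // keepWhitespace == false mimics "-> skip"; true mimics "-> channel(HIDDEN)".
    static List<Tok> lex(String input, boolean keepWhitespace) {
        List<Tok> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            boolean ws = Character.isWhitespace(input.charAt(i));
            int start = i;
            while (i < input.length() && Character.isWhitespace(input.charAt(i)) == ws) {
                i++;
            }
            String text = input.substring(start, i);
            if (!ws) {
                tokens.add(new Tok(text, DEFAULT));
            } else if (keepWhitespace) {
                tokens.add(new Tok(text, HIDDEN));
            }
            // Otherwise the whitespace is dropped and cannot be recovered later.
        }
        return tokens;
    }

    // Emits the text of every token in the stream regardless of channel,
    // replacing DEFAULT-channel tokens equal to 'from' with 'to'.
    static String render(List<Tok> tokens, String from, String to) {
        StringBuilder out = new StringBuilder();
        for (Tok t : tokens) {
            boolean replace = t.channel == DEFAULT && t.text.equals(from);
            out.append(replace ? to : t.text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String src = "public class Main { }";
        System.out.println(render(lex(src, false), "Main", "Renamed")); // publicclassRenamed{}
        System.out.println(render(lex(src, true), "Main", "Renamed"));  // public class Renamed { }
    }
}
```

With the whitespace dropped, the output runs together just like in the question; with the whitespace kept on the hidden channel, the original layout survives the replacement.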