I'm using ANTLR4 to create a parse tree for my grammar, what I want to do is modify certain nodes in the tree. This will include removing certain nodes and inserting new ones. The purpose behind this is optimization for the language I am writing. I have yet to find a solution to this problem. What would be the best way to go about this?
3 Answers
While there is currently no real support or tools for tree rewriting, it is very possible to do. It's not even that painful.
The ParseTreeListener
or your MyBaseListener
can be used with a ParseTreeWalker
to walk your parse tree.
From here, you can remove nodes with ParserRuleContext.removeLastChild()
, however when doing this, you have to watch out for ParseTreeWalker.walk
:
public void walk(ParseTreeListener listener, ParseTree t) {
if ( t instanceof ErrorNode) {
listener.visitErrorNode((ErrorNode)t);
return;
}
else if ( t instanceof TerminalNode) {
listener.visitTerminal((TerminalNode)t);
return;
}
RuleNode r = (RuleNode)t;
enterRule(listener, r);
int n = r.getChildCount();
for (int i = 0; i<n; i++) {
walk(listener, r.getChild(i));
}
exitRule(listener, r);
}
You must replace removed nodes with something if the walker has visited parents of those nodes, I usually pick empty ParseRuleContext
objects (this is because of the cached value of n
in the method above). This prevents the ParseTreeWalker
from throwing a NPE.
When adding nodes, make sure to set the mutable parent on the ParseRuleContext
to the new parent. Also, because of the cached n
in the method above, a good strategy is to detect where the changes need to be before you hit where you want your changes to go in the walk
, so the ParseTreeWalker
will walk over them in the same pass (other wise you might need multiple passes...)
Your pseudo code should look like this:
public void enterRewriteTarget(@NotNull MyParser.RewriteTargetContext ctx){
if(shouldRewrite(ctx)){
ArrayList<ParseTree> nodesReplaced = replaceNodes(ctx);
addChildTo(ctx, createNewParentFor(nodesReplaced));
}
}
I've used this method to write a transpiler that compiled a synchronous internal language into asynchronous javascript. It was pretty painful.
Another approach would be to write a ParseTreeVisitor
that converts the tree back to a string. (This can be trivial in some cases, because you are only calling TerminalNode.getText()
and concatenate in aggregateResult(..)
.)
You then add the modifications to this visitor so that the resulting string representation contains the modifications you try to achieve.
Then parse the string and you get a parse tree with the desired modifications.
This is certainly hackish in some ways, since you parse the string twice. On the other hand the solution does not rely on antlr implementation details.
I needed something similar for simple transformations. I ended up using a ParseTreeWalker
and a custom ...BaseListener
where I overwrote the enter...
methods. Inside this method the ParserRuleContext.children
is available and can be manipulated.
class MyListener extends ...BaseListener {
@Override
public void enter...(...Context ctx) {
super.enter...(ctx);
ctx.children.add(...);
}
}
new ParseTreeWalker().walk(new MyListener(), parseTree);