How do I convert this Antlr3 AST to Antlr4?

Question

I'm trying to convert my existing Antlr3 project to Antlr4 to get more functionality. I have this grammar, which won't compile with Antlr 4.9:

expr        
    : term ( OR^ term )* ;

and

factor 
    : ava | NOT^ factor | (LPAREN! expr RPAREN!) ;

That is mostly because Antlr4 no longer supports ^ and !. According to the documentation, those are:

AST root operator. When generating abstract syntax trees (ASTs), token references suffixed with the "^" root operator force AST nodes to be created and added as the root of the current tree. This symbol is only effective when the buildAST option is set. More information about ASTs is also available.

AST exclude operator. When generating abstract syntax trees, token references suffixed with the "!" exclude operator are not included in the AST constructed for that rule. Rule references can also be suffixed with the exclude operator, which implies that, while the tree for the referenced rule is constructed, it is not linked into the tree for the referencing rule. This symbol is only effective when the buildAST option is set. More information about ASTs is also available.

If I take those out, the grammar compiles, but I'm not sure what those operators mean or how Antlr4 supports the same behavior.

LPAREN and RPAREN are tokens:

tokens {
    EQUALS = '=';
    LPAREN = '(';
    RPAREN = ')';
}

Antlr4's error messages kindly explain how to convert those, but not ^ and !. The grammar parses boolean expressions, for example (a=b AND b=c).

I think this is the rule

targetingexpr returns [boolean value]
    : expr { $value = $expr.value; } ;

expr returns [boolean value]
    : ^(NOT a=expr) { $value = !a; }        
    | ^(AND a=expr b=expr) { $value = a && b; }
    | ^(OR a=expr b=expr) { $value = a || b; }
    | ^(EQUALS A=ALPHANUM B=ALPHANUM) { $value = targetingContext.contains($A.text,$B.text); }
    ;

Bart Kiers · Accepted Answer · 2020-12-20T19:34:11

The v3 grammar:

...

tokens {
    EQUALS = '=';
    LPAREN = '(';
    RPAREN = ')';
}

...

expr        
    : term ( OR^ term )* ;

factor 
    : ava | NOT^ factor | (LPAREN! expr RPAREN!) ;

in v4 would look like this:

...

expr        
    : term ( OR term )* ;

factor 
    : ava | NOT factor | (LPAREN expr RPAREN) ;

EQUALS : '=';
LPAREN : '(';
RPAREN : ')';

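So, just remove the inline ^ and ! operators (tree rewriting is no longer available in ANTLR4), and move the literal tokens from the tokens { ... } section into their own lexer rules. ANTLR4 always builds a full parse tree for you, so the operators and parentheses simply stay in that tree as leaf nodes.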

I think this is the rule

targetingexpr returns [boolean value]
    : expr { $value = $expr.value; } ;

expr returns [boolean value]
    : ^(NOT a=expr) { $value = !a; }        
    | ^(AND a=expr b=expr) { $value = a && b; }
    | ^(OR a=expr b=expr) { $value = a || b; }
    | ^(EQUALS A=ALPHANUM B=ALPHANUM) { $value = targetingContext.contains($A.text,$B.text); }
    ;

What you posted there is part of a tree grammar, for which there is no equivalent in ANTLR4. Instead of evaluating your expressions inside a tree grammar, you'd write a visitor that walks the parse tree and computes the result.
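
To give an idea of what such a visitor does, here is a minimal, self-contained Java sketch. It does not use the ANTLR runtime: the node classes (Not, And, Or, Equals), the Visitor interface and the BiPredicate standing in for targetingContext.contains(...) are all hypothetical stand-ins so the example runs on its own. In a real project you would generate the parser with the -visitor option, extend the generated base visitor for your grammar, and override one visit method per rule or labeled alternative; the method names depend on your grammar.

```java
import java.util.function.BiPredicate;

// Hand-written stand-ins for the parse-tree contexts ANTLR4 would generate.
// All names in this class are assumptions made for illustration only.
final class VisitorSketch {

    interface Visitor<T> {
        T visitNot(Not n);
        T visitAnd(And n);
        T visitOr(Or n);
        T visitEquals(Equals n);
    }

    interface Node {
        <T> T accept(Visitor<T> v);
    }

    static final class Not implements Node {
        final Node operand;
        Not(Node operand) { this.operand = operand; }
        public <T> T accept(Visitor<T> v) { return v.visitNot(this); }
    }

    static final class And implements Node {
        final Node left, right;
        And(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitAnd(this); }
    }

    static final class Or implements Node {
        final Node left, right;
        Or(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitOr(this); }
    }

    static final class Equals implements Node {
        final String key, value;
        Equals(String key, String value) { this.key = key; this.value = value; }
        public <T> T accept(Visitor<T> v) { return v.visitEquals(this); }
    }

    // One method per alternative of the v3 tree grammar; each body plays
    // the role of the embedded action ({ $value = ...; }) in that grammar.
    static final class EvalVisitor implements Visitor<Boolean> {
        // Stand-in for targetingContext.contains(key, value).
        private final BiPredicate<String, String> context;

        EvalVisitor(BiPredicate<String, String> context) { this.context = context; }

        public Boolean visitNot(Not n) { return !n.operand.accept(this); }
        public Boolean visitAnd(And n) { return n.left.accept(this) && n.right.accept(this); }
        public Boolean visitOr(Or n) { return n.left.accept(this) || n.right.accept(this); }
        public Boolean visitEquals(Equals n) { return context.test(n.key, n.value); }
    }

    // Convenience entry point: evaluate a tree against a given context.
    static boolean evaluate(Node tree, BiPredicate<String, String> context) {
        return tree.accept(new EvalVisitor(context));
    }
}
```

For example, building the tree for (a=b AND NOT b=c) by hand and calling VisitorSketch.evaluate(tree, context) returns whatever the context predicate implies for those two comparisons. With ANTLR4 the tree would come from the generated parser instead of being constructed manually.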
