Why is there such a difference between parse tree visualization and my visitor/listener traversal?

Question

I've created this sample grammar using the ANTLR4 plugin in IntelliJ and when I use its tool chain to generate a visual representation for some invalid content (in this case an empty string), this representation seems to differ from what I'm able to get when doing actual parse tree traversal using a sample visitor/listener implementation for same input.

This is the grammar:

grammar TestParser;

THIS            : 'this';

Identifier
    : [a-zA-Z0-9]+
    ;

WS  :  [ \t\r\n\u000C]+ -> skip;

parseExpression:
    expression EOF
;

expression
    : expression bop='.' (Identifier | THIS ) #DottedExpression
    | primary                                 #PrimaryExpression
    ;

primary
        : THIS                                #This
        | Identifier                          #PrimaryIdentifier
        ;

For an empty string, I get the following tree:

This tree indicates that the parser built a parse tree that contains "DottedExpression" and "primary:This" (assuming it uses its own visitor/listener implementation to do this). Yet when I attempt the same using the following code:

package org.example.so;

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

public class TestParser {
    public static void main(String[] args) {
        String input = "";
        TestParserLexer lexer = new TestParserLexer(CharStreams.fromString(input));
        CommonTokenStream tokenStream = new CommonTokenStream(lexer);
        TestParserParser parser = new TestParserParser(tokenStream);

        TestParserParser.ParseExpressionContext parseExpressionContext = parser.parseExpression();

        MyVisitor visitor = new MyVisitor();
        visitor.visit(parseExpressionContext);

        System.out.println("----------------");

        ParseTreeWalker walker = new ParseTreeWalker();
        walker.walk(new MyListener(), parseExpressionContext);

        System.out.println("----------------");
    }

    private static class MyVisitor extends TestParserBaseVisitor {

        @Override
        public Object visitParseExpression(TestParserParser.ParseExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
            return super.visitParseExpression(ctx);
        }

        @Override
        public Object visitDottedExpression(TestParserParser.DottedExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":DottedExpression");
            return super.visitDottedExpression(ctx);
        }

        @Override
        public Object visitPrimaryExpression(TestParserParser.PrimaryExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":PrimaryExpression");
            return super.visitPrimaryExpression(ctx);
        }

        @Override
        public Object visitThis(TestParserParser.ThisContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
            return super.visitThis(ctx);
        }

        @Override
        public Object visitPrimaryIdentifier(TestParserParser.PrimaryIdentifierContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
            return super.visitPrimaryIdentifier(ctx);
        }
    }

    private static  class MyListener extends TestParserBaseListener {

        @Override
        public void enterParseExpression(TestParserParser.ParseExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
        }

        @Override
        public void enterDottedExpression(TestParserParser.DottedExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":DottedExpression");
        }

        @Override
        public void enterPrimaryExpression(TestParserParser.PrimaryExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":PrimaryExpression");
        }

        @Override
        public void enterThis(TestParserParser.ThisContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
        }

        @Override
        public void enterPrimaryIdentifier(TestParserParser.PrimaryIdentifierContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
        }
    }
}

I get the following output:

line 1:0 mismatched input '<EOF>' expecting {'this', Identifier}
parseExpression
expression:PrimaryExpression
----------------
parseExpression
expression:PrimaryExpression
----------------

So, not only the tree depths do not match, the output even indicates a different rule was matched second ("PrimaryExpression" instead of "DottedExpression").

Why is there such a difference between what I'm shown and what I attempt to show? How do I create the same tree representation as shown by the plugin?

Using ANTLR version 4.7. Plugin version is 1.8.4.

Took a quick look at the ANTLR source. My guess is that the graphical parse tree is being created with the parser in a different prediction mode than you're getting when you run the parser yourself. — Gene
Incidentally, this appears to be the code that's generating the parse tree that's rendered graphically. Maybe you can look for clues in how the parser is set up here: github.com/antlr/antlr4/blob/… — Gene

MaanooAk MaanooAk · Accepted Answer · 2017-09-05T14:34:44

This issue was fixed at version 1.8.2 of the plugin. In case that you have version 1.8.2 or later, then you probably discovered an other unknown subcase of the issue.

However (based on the issue that I am referring to), trees differ only when the parsing is resulting to an error. So if you are not interested on using the errors' information, you should be fine.

Why is there such a difference between parse tree visualization and my visitor/listener traversal?

1 Answers