0
votes

I'm very new to ANTLR4 and am trying to build my own language. So my grammar starts at

program: <EOF> | statement | functionDef | statement program | functionDef program;

and my statement is

statement: selectionStatement | compoundStatement | ...;

and

selectionStatement
:   If LeftParen expression RightParen compoundStatement (Else compoundStatement)?
|   Switch LeftParen expression RightParen compoundStatement
;

compoundStatement
: LeftBrace statement* RightBrace;

Now the problem is that when I test a piece of code against selectionStatement or statement, it passes, but when I test it against program, it is not recognized. Can anyone help me with this? Thank you very much.


Edit: the code I use to test is the following:

if (x == 2) {}

It passes the test against selectionStatement and statement but fails against program. It appears that program only accepts if...else:

if (x == 2) {} else {}

Edit 2: The error message I received was

<unknown>: Incorrect error: no viable alternative at input 'if(x==2){}'
1
Still cannot answer based on incomplete information. What is the complete statement rule? What is the complete error message (the error listener puts out a lot more valuable information)? What is the complete token stream dump? What are your own questions and conclusions based on your own analysis of this information? - GRosenberg

1 Answer

2
votes

Cannot answer your question given the incomplete information provided: the statement rule is partial and the compoundStatement rule is missing.

Nonetheless, there are two techniques you should be using to answer this kind of question yourself (in addition to unit tests).

First, ensure that the lexer is working as expected. This answer shows how to dump the token stream directly.

Second, use a custom ErrorListener to report a detailed description of the parse path for every error encountered. An example:

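import java.util.List;

import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.TokenStream;

// Log stands in for your logging package; JavaLexer is the generated lexer.
public class JavaErrorListener extends BaseErrorListener {

    // Token index of the most recently reported error (-1: none yet).
    public int lastError = -1;

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine,
            String msg, RecognitionException e) {

        // This listener is meant to be attached to a parser, not a lexer.
        Parser parser = (Parser) recognizer;
        String name = parser.getSourceName();
        TokenStream tokens = parser.getInputStream();

        Token offSymbol = (Token) offendingSymbol;
        int thisError = offSymbol.getTokenIndex();

        // An error on the final EOF token is logged at debug level only.
        if (offSymbol.getType() == Token.EOF && thisError == tokens.size() - 1) {
            Log.debug(this, name + ": Incorrect error: " + msg);
            return;
        }
        String offSymName = JavaLexer.VOCABULARY.getSymbolicName(offSymbol.getType());

        // Names of the rules that were being matched when the error occurred.
        List<String> stack = parser.getRuleInvocationStack();
        // Collections.reverse(stack); // uncomment to reverse the order

        Log.error(this, name);
        Log.error(this, "Rule stack: " + stack);
        Log.error(this, "At line " + line + ":" + charPositionInLine + " at " + offSymName + ": " + msg);

        // Print at most the 10 tokens leading up to the offending token,
        // skipping tokens already printed for an earlier error.
        if (thisError > lastError + 10) {
            lastError = thisError - 10;
        }
        for (int idx = lastError + 1; idx <= thisError; idx++) {
            Token token = tokens.get(idx);
            if (token.getChannel() != Token.HIDDEN_CHANNEL) Log.error(this, token.toString());
        }
        lastError = thisError;
    }
}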

Note: adjust the Log statements to whatever logging package you are using.
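The lastError bookkeeping at the end of the listener is easy to misread, so here is a minimal sketch of just that logic, runnable without the ANTLR runtime. The class name ErrorWindow and the method indicesToPrint are hypothetical, introduced only for illustration; the arithmetic is copied from the listener above. It returns the token indices the listener would print for an error at a given token index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: isolates the token-window arithmetic of the listener.
public class ErrorWindow {

    // Token index of the most recently reported error (-1: none yet).
    private int lastError = -1;

    // Returns the token indices that would be printed for an error at thisError.
    public List<Integer> indicesToPrint(int thisError) {
        // Cap the window at the 10 tokens ending at the offending token.
        if (thisError > lastError + 10) {
            lastError = thisError - 10;
        }
        List<Integer> indices = new ArrayList<>();
        for (int idx = lastError + 1; idx <= thisError; idx++) {
            indices.add(idx);
        }
        // Remember this error so later reports do not repeat these tokens.
        lastError = thisError;
        return indices;
    }
}
```

In this sketch, a first error at token 3 yields indices 0 through 3, a later error at token 30 yields the ten indices 21 through 30, and a following error at token 32 yields only 31 and 32, since everything earlier has already been shown.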

Finally, ANTLR doesn't do 'weird' things - just things that you don't understand.