0
votes

I'm just getting into ANTLR grammars and I thought I had a pretty simple one. The problem I am running into is that [1+2] parses to

  • PLUS (+)
    • INT_LITERAL (1)
    • INT_LITERAL (2)

which is correct, but [a1+2] also parses to

  • PLUS (+)
    • INT_LITERAL (1)
    • INT_LITERAL (2)

instead of giving me an error like I would expect.

Thanks in advance.

grammar MyExpressions;

options {
    language=CSharp3;
    TokenLabelType=CommonToken;
    ASTLabelType=CommonTree;
    output=AST;
    k=10;
}

@lexer::namespace{Expressions}
@parser::namespace{Expressions}


/*
 * Parser Rules
 */

public root: LBRACKET! expression^ RBRACKET!;
expression: binaryOperation;

binaryOperation: (term PLUS^ term);

term: INT_LITERAL;

/*
 * Lexer Rules
 */

PLUS: '+';
LBRACKET: '[';
RBRACKET: ']';  
INT_LITERAL: '1'..'9'+;
WS: ' ';

I fixed the issue by adding this to my grammar:

@lexer::members {
    public override void DisplayRecognitionError(string[] tokenNames,
                                        RecognitionException e) {
        string hdr = GetErrorHeader(e);
        string msg = GetErrorMessage(e, tokenNames);

        throw new SyntaxException(hdr,msg);
    }
}

@parser::members {
    public override void DisplayRecognitionError(string[] tokenNames,
                                        RecognitionException e) {
        string hdr = GetErrorHeader(e);
        string msg = GetErrorMessage(e, tokenNames);

        throw new SyntaxException(hdr,msg);
    }
}

SyntaxException is a custom exception that I created for my application.

2 Answers

1
votes

I found out what was going on. When you use the @members construct, it adds the code to the parser but not the lexer. To get the code into the lexer, you have to prefix it like this: @lexer::members. Once I did that, the lexer was generated correctly.

public override void DisplayRecognitionError(string[] tokenNames,
                                    RecognitionException e) {
    string hdr = GetErrorHeader(e);
    string msg = GetErrorMessage(e, tokenNames);
    // Now do something with hdr and msg...

    System.Console.WriteLine("Header:  " + hdr);
    System.Console.WriteLine("Message: " + msg);

    throw new System.Exception("Syntax Error: " + hdr + " " + msg);
}
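
For anyone else tripping over this, here is a rough sketch of how the action scoping works in a combined ANTLR 3 grammar. It is only a fragment (the rules are omitted), and the comments inside the blocks are placeholders, not code from my project.

```antlr
grammar MyExpressions;

// In a combined grammar, an unqualified @members block is copied into
// the generated PARSER only. It is the same as writing @parser::members,
// so declare one or the other, not both.
@members {
    // parser-only code, e.g. the DisplayRecognitionError override
}

// Qualify the action name to put code into the generated LEXER.
@lexer::members {
    // lexer-only code, e.g. the same DisplayRecognitionError override
}
```

Since the stray 'a' is rejected by the lexer, the override has to live in the lexer block for the exception to be thrown for that input.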

I'm still a little thrown off because NumberOfSyntaxErrors still showed 0, but the grammar is now breaking like it should.

0
votes

'a' is an invalid character and so you should've gotten a tokenization error. The parser will not see it. Ter
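
If you would rather have the parser see the bad character, one commonly used approach (not something from the question, and I have not run it against this exact grammar) is to add a catch-all rule as the last lexer rule. The rule name INVALID is made up for this sketch; the other rules are copied from the question.

```antlr
/*
 * Lexer Rules (as in the question, plus one hypothetical catch-all)
 */

PLUS: '+';
LBRACKET: '[';
RBRACKET: ']';
INT_LITERAL: '1'..'9'+;
WS: ' ';

// Hypothetical addition. Keep it LAST so every rule above wins when
// both could match. Any character no other rule accepts, such as the
// 'a' in [a1+2], becomes an INVALID token instead of a lexer error.
INVALID: . ;
```

Because no parser rule references INVALID, the parser should then hit an unexpected token where it wants a term and report the problem through its own error handling.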