
I'm currently working on a simple ANTLR4 grammar for evaluating mathematical expressions. For now, the grammar only needs to parse simple operations such as multiplication, division, addition and subtraction. Here's my grammar:

grammar WRB;

options {
language = Java;
}

prog: stat+;

stat: expr SEPARATOR #printExpr
    | ID ASSIGN expr SEPARATOR #assignment
    ;

expr: expr op=(MUL|DIV) expr #punkt
    | expr op=(ADD|SUB) expr #strich
    | num #number
    | (SIGN)? ID #ref
    | '(' expr ')' #klammer
    ;

ID  :   [a-zA-Z]+;
DIGITS :   [0-9]+ ;

ASSIGN: '=';
MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';

integer: (SIGN)? DIGITS;
floating:  (integer)? '.' DIGITS;
num:  (integer | floating);
SIGN: '+' | '-';

SEPARATOR: ';';
WS: [ \t\r\n]+ -> skip ;

Everything works fine except for negative numbers. Here's the syntax tree for the sample "-4 + 9":

[Image: syntax tree for "-4 + 9"]

I'm fairly new to language recognition and grammars. I don't see why ANTLR treats the negative sign as extraneous input. Shouldn't the expr rule descend into the #number alternative?

Thanks in advance.

Run grun in "tokens" mode to see how your input is being lexed; "gui" mode doesn't reveal that. It also helps to completely separate all your parser rules from all your lexer rules. Intermingling them can lead to (programmer) confusion. - TomServo

1 Answer


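Without testing: try removing the SIGN rule and rewriting integer as (SUB|ADD)? DIGITS. The (SIGN)? ID in the #ref alternative would need the same change. My understanding is that SIGN can never match because it is defined after ADD and SUB, which match the same characters. The ANTLR lexer picks the longest match and, when several rules match the same length, the rule defined first wins. Tokens are fixed before parsing starts; the lexer makes no attempt to rematch for "better parsing".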
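To make the tie-breaking concrete, here is a rough Java sketch that imitates the selection rule described above (longest match wins, ties go to the rule defined first). It is a toy model, not the ANTLR runtime: the class ToyLexer and the method tokenTypes are made-up names, the token rules are copied from the question in their original order, and the implicit literal tokens '(' , ')' and '.' are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of how a lexer chooses between token rules. NOT the ANTLR runtime.
public class ToyLexer {
    // Token rules in the order they appear in the question's grammar.
    // LinkedHashMap keeps insertion order, which stands in for definition order.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ID", Pattern.compile("[a-zA-Z]+"));
        RULES.put("DIGITS", Pattern.compile("[0-9]+"));
        RULES.put("ASSIGN", Pattern.compile("="));
        RULES.put("MUL", Pattern.compile("\\*"));
        RULES.put("DIV", Pattern.compile("/"));
        RULES.put("ADD", Pattern.compile("\\+"));
        RULES.put("SUB", Pattern.compile("-"));
        RULES.put("SIGN", Pattern.compile("[+-]"));
        RULES.put("SEPARATOR", Pattern.compile(";"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
    }

    // Returns the token type names for the input, skipping WS as the grammar does.
    public static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so on a tie the rule defined earlier is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestRule = rule.getKey();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!bestRule.equals("WS")) {
                types.add(bestRule);
            }
            pos += bestLen;
        }
        return types;
    }

    public static void main(String[] args) {
        // prints [SUB, DIGITS, ADD, DIGITS]
        System.out.println(tokenTypes("-4 + 9"));
    }
}
```

In this model the leading "-" comes out as SUB, never as SIGN, because SUB and SIGN both match one character and SUB is defined first. The parser rule integer: (SIGN)? DIGITS therefore never sees a SIGN token, which is consistent with the "-" being reported as extraneous input.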