Beginner: ANTLR4 Grammar doesn't handle negative numbers

Question

I'm currently working on a simple ANTLR4 grammar for evaluating mathematical expressions. At the moment, my grammar should just be able to parse simple operations like multiplications, divisions, additions and subtractions ... Here's my grammar:

grammar WRB;

options {
language = Java;
}

prog: stat+;

stat: expr SEPARATOR #printExpr
    | ID ASSIGN expr SEPARATOR #assignment
    ;

expr: expr op=(MUL|DIV) expr #punkt
    | expr op=(ADD|SUB) expr #strich
    | num #number
    | (SIGN)? ID #ref
    | '(' expr ')' #klammer
    ;

ID  :   [a-zA-Z]+;
DIGITS :   [0-9]+ ;

ASSIGN: '=';
MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';

integer: (SIGN)? DIGITS;
floating:  (integer)? '.' DIGITS;
num:  (integer | floating);
SIGN: '+' | '-';

SEPARATOR: ';';
WS: [ \t\r\n]+ -> skip ;

Everything works fine besides the negative numbers. Here's the syntax tree for the sample "-4 + 9":

I'm fairly new to language recognition and grammars. I don't see why ANTLR treats the negative sign as extraneous input. Shouldn't the expr rule descend into the #number alternative?

Thanks in advance.

Run grun in "tokens" mode to see how your input is being lexed. "gui" mode doesn't reveal that. Also it's helpful to completely separate all your parser rules from all your lexer rules. Intermingling them can lead to (programmer) confusion. — TomServo

doublep · Accepted Answer · 2019-10-24T19:39:10

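Without testing: try removing the SIGN rule and rewriting integer as (SUB|ADD)? DIGITS. My understanding is that SIGN will never match because it is defined after SUB and ADD, which match exactly the same text. Token rules always follow "longest match wins, and on a tie the rule defined first wins"; the lexer runs before the parser and makes no attempt to rematch for "better parsing". So "-" is always lexed as SUB, and the integer rule, which expects SIGN, never sees one.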
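
A minimal sketch of that change, equally untested, showing only the rules that differ from the grammar in the question. One assumption goes beyond what is stated above: the #ref alternative also references SIGN, so it is switched to (ADD|SUB) as well, since the SIGN token no longer exists.

```antlr
// Sketch only (untested): the SIGN lexer rule is removed and the parser
// rules reuse the existing ADD and SUB tokens for an optional sign.
expr: expr op=(MUL|DIV) expr #punkt
    | expr op=(ADD|SUB) expr #strich
    | num #number
    | (ADD|SUB)? ID #ref        // assumption: was (SIGN)? ID
    | '(' expr ')' #klammer
    ;

integer: (ADD|SUB)? DIGITS;     // was (SIGN)? DIGITS
floating: (integer)? '.' DIGITS;
num: (integer | floating);

// SIGN: '+' | '-';             // removed: ADD and SUB are defined earlier
//                              // and match the same single characters
```

The remaining lexer rules (ID, DIGITS, ASSIGN, MUL, DIV, ADD, SUB, SEPARATOR, WS) stay as they are. Running `grun WRB prog -tokens` on "-4 + 9;", as the comment on the question suggests, should show the leading "-" arriving as a SUB token, which is what the integer rule now accepts.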
