0
votes

I have tried to write a grammar to recognize expressions like:

(A + MAX(B) ) / ( C - AVERAGE(A) )
IF( A > AVERAGE(A), 0, 1 ) 
X / (MAX(X)

Unfortunately antlr3 fails with these errors:

error(210): The following sets of rules are mutually left-recursive [unaryExpression, additiveExpression, primaryExpression, formula, multiplicativeExpression]

error(211): DerivedKeywords.g:110:13: [fatal] rule booleanTerm has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.

error(206): DerivedKeywords.g:110:13: Alternative 1: after matching input such as decision cannot predict what comes next due to recursion overflow to additiveExpression from formula

I have spent some hours trying to fix these, it would be great if anyone could at least help me fix the first problem. Thanks

Code:

    grammar DerivedKeywords;
options {
output=AST;
//backtrack=true;
}
WS    : (    ' ' | '\t' | '\n' | '\r' )
            { skip(); }
      ;

//for numbers
DIGIT
      :     '0'..'9'
      ;
//for both integer and real number
NUMBER
      :     (DIGIT)+ ( '.' (DIGIT)+ )?( ('E'|'e')('+'|'-')?(DIGIT)+ )?
      ;

// Boolean operatos
AND : 'AND';
OR : 'OR';
NOT : 'NOT';
EQ : '=';
NEQ : '!=';
GT : '>';
LT : '<';
GTE : '>=';
LTE : '<=';
COMMA : ',';

// Token for Functions
IF : 'IF';
MAX : 'MAX';
MIN : 'MIN';
AVERAGE : 'AVERAGE';


VARIABLE :   'A'..'Z' ('A'..'Z' | '0'..'9')*
         ;
// OPERATORS

LPAREN      :     '('   ;     
RPAREN            :     ')'   ;

DIV         :     '/'   ;     
PLUS              :     '+'         ;
MINUS       :     '-'   ;     
STAR              :     '*'         ;


expression : formula;

formula 
        :   functionExpression
        |   additiveExpression 
        |   LPAREN! a=formula RPAREN! // First Problem
        ;

additiveExpression 
      : a=multiplicativeExpression ( (MINUS^ | PLUS^ )   b=multiplicativeExpression )* 
      ;


multiplicativeExpression
      : a=unaryExpression ( (STAR^ | DIV^ ) b=unaryExpression )* 
        
      ;

unaryExpression
      :     MINUS^ u=unaryExpression 
      |     primaryExpression 
      ;

functionExpression 
        : f=functionOperator LPAREN e=formula RPAREN 
        |  IF LPAREN b=booleanExpression COMMA p=formula COMMA s=formula RPAREN
        ; 


functionOperator : 
    MAX | MIN | AVERAGE;


primaryExpression
      :     NUMBER 
      // Used for scientific numbers
      |     DIGIT 
      |     VARIABLE 
      |     formula
    ;

// Boolean stuff
booleanExpression 
            : orExpression;

orExpression :  a=andExpression (OR^ b=andExpression )*
                ;
andExpression 
            :   a=notExpression (AND^ b=notExpression )*
                ;

notExpression
            : NOT^ t=booleanTerm 
            | booleanTerm
            ;

booleanOperator :
    GT | LT | EQ | GTE | LTE | NEQ; 

booleanTerm : a=formula op=booleanOperator b=formula                 
            | LPAREN! booleanTerm RPAREN! // Second problem
;
1

1 Answers

4
votes

error(210): The following sets of rules are mutually left-recursive [unaryExpression, additiveExpression, primaryExpression, formula, multiplicativeExpression]

- this means that if the parser enters unaryExpression rule, it has the possibility to match additiveExpression, primaryExpression, formula, multiplicativeExpression and unaryExpression again without ever consuming a single token from input - so it cannot decide whether to use those rules or not, because even if it uses the rules, the input will be the same.

You're probably trying to allow subexpressions in expressions by this sequence of rules - you need to make sure that path will consume the left parenthesis of the subexpression. Probably the formula alternative in primaryExpression should be changed to LPAREN formula RPAREN, and the rest of grammar be adjusted accordingly.