
I would like to parse two types of boolean expression:
- the first is an init expression with a boolean, such as: init : false
- the second is a derive expression with a boolean, such as: derive : !express or (express and (amount >= 100))

My idea is to put semantic predicates on the alternatives of one rule. The goal is that when I'm parsing a boolean expression beginning with the word 'init', only one alternative is available: boolliteral, the last alternative in boolExpression. If the expression begins with the word 'derive', all alternatives of boolExpression are available.

I know that I could write two versions of boolExpression without semantic predicates, such as boolExpressionInit and boolExpressionDerive, but I would like to find out whether my idea can work with a single boolExpression that uses semantic predicates.

Here's my grammar:

grammar TestExpression;

@header
{
package testexpressionparser;
}

@parser::members {
                    int vConstraintType;
                 }

/* SYNTAX RULES */
textInput       : initDefinition 
                | derDefinition ;

initDefinition  : t=INIT {vConstraintType = $t.type;} ':' boolExpression ;

derDefinition   : t=DERIVE {vConstraintType = $t.type;} ':' boolExpression ;

boolExpression  : {vConstraintType != INIT || vConstraintType == DERIVE}? boolExpression (boolOp|relOp) boolExpression 
                | {vConstraintType != INIT || vConstraintType == DERIVE}? NOT boolExpression
                | {vConstraintType != INIT || vConstraintType == DERIVE}? '(' boolExpression ')' 
                | {vConstraintType != INIT || vConstraintType == DERIVE}? attributeName
                | {vConstraintType != INIT || vConstraintType == DERIVE}? numliteral
                | {vConstraintType == INIT || vConstraintType == DERIVE}? boolliteral
                ;

boolOp          : OR | AND ;
relOp           : EQ | NEQ | GT | LT | GEQT | LEQT ;
attributeName   : WORD;
numliteral      : intliteral | decliteral;
intliteral      : INT ;
decliteral      : DEC ;
boolliteral     : BOOLEAN;


/* LEXICAL RULES */
INIT            : 'init';
DERIVE          : 'derive';
BOOLEAN         : 'true' | 'false' ;
BRACKETSTART    : '(' ;
BRACKETSTOP     : ')' ;
BRACESTART      : '{' ;
BRACESTOP       : '}' ;
EQ              : '=' ;
NEQ             : '!=' ;
NOT             : '!' ;
GT              : '>' ;
LT              : '<' ;
GEQT            : '>=' ;
LEQT            : '<=' ;
OR              : 'or' ;
AND             : 'and' ;
DEC             : [0-9]* '.' [0-9]* ;
INT             : ZERO | POSITIF;
ZERO            : '0';
POSITIF         : [1-9] [0-9]* ;
WORD            : [a-zA-Z] [_0-9a-zA-Z]* ;
WS              : (SPACE | NEWLINE)+ -> skip ;
SPACE           : [ \t] ;                       /* Space or tab */
NEWLINE         : '\r'? '\n' ;                  /* Carriage return and new line */

I expected the grammar to compile successfully, but what I get is: "error(119): TestExpression.g4::: The following sets of rules are mutually left-recursive [boolExpression]
1 error(s) BUILD FAIL"

The error message seems a tad misleading (claiming there's mutual left-recursion when the left recursion is actually direct), but it looks like ANTLR4's support for left recursion does not work with predicates. Can you explain what the predicate is for and when vConstraintType changes its value? Please also post a minimal reproducible example. - sepp2k
@sepp2k, I have edited my post again with a better explanation of what I am trying to do and with a minimal reproducible example, as you requested. - talohsa
Have you tried moving the semantic predicates from the beginning of the alternatives to somewhere else? Also, I think the grammar can be rewritten without semantic predicates at all. - Ivan Kochurkin

1 Answer


Apparently ANTLR4's support for (direct) left-recursion does not work when a predicate appears before a left-recursive rule invocation. So you can fix the error by moving the predicate after the first boolExpression in the left-recursive alternatives.
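
As a rough sketch of that change (untested, and assuming the rest of the grammar stays exactly as posted), only the first alternative differs from the question: its predicate now sits behind the leading boolExpression, so the alternative starts with the recursive call again. The predicate conditions themselves are copied unchanged from the question.

```antlr
boolExpression  : boolExpression {vConstraintType != INIT || vConstraintType == DERIVE}? (boolOp|relOp) boolExpression
                | {vConstraintType != INIT || vConstraintType == DERIVE}? NOT boolExpression
                | {vConstraintType != INIT || vConstraintType == DERIVE}? '(' boolExpression ')'
                | {vConstraintType != INIT || vConstraintType == DERIVE}? attributeName
                | {vConstraintType != INIT || vConstraintType == DERIVE}? numliteral
                | {vConstraintType == INIT || vConstraintType == DERIVE}? boolliteral
                ;
```

The other alternatives do not begin with boolExpression, so their predicates can stay where they are.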

That said, it seems like the predicates aren't really necessary in the first place - at least not in the example you've shown us (or the one before your edit as far as I could tell). Since a boolExpression with the constraint type INIT can apparently only match boolliteral, you can just change initDefinition as follows:

initDefinition  : t=INIT ':' boolliteral ;

Then boolExpression will always have the constraint type DERIVE and no predicates are necessary anymore.
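
Put together, the parser rules might then look like the sketch below. It assumes the lexer rules and the remaining parser rules (boolOp, relOp, attributeName, numliteral, boolliteral) stay as in the question, and it drops the vConstraintType field and the t= labels, since nothing reads them anymore.

```antlr
/* SYNTAX RULES, without predicates or the vConstraintType field */
textInput       : initDefinition
                | derDefinition ;

initDefinition  : INIT ':' boolliteral ;        // an init value can only be a boolean literal

derDefinition   : DERIVE ':' boolExpression ;   // a derive value can be any boolean expression

boolExpression  : boolExpression (boolOp|relOp) boolExpression
                | NOT boolExpression
                | '(' boolExpression ')'
                | attributeName
                | numliteral
                | boolliteral
                ;
```

Because the first alternative starts directly with boolExpression again, this is the plain direct left recursion that ANTLR4 is designed to handle.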

Generally, if you want to allow different alternatives in non-terminal x based on whether it was invoked by y or z, you should simply have multiple versions of x and then call one from y and the other from z. That's usually a lot less hassle than littering the code with actions and predicates.

Similarly, it can also make sense to have a rule that matches more than it should and then detect illegal expressions in a later phase instead of trying to reject them at the syntax level. In particular, beginners often try to write grammars that only allow well-typed expressions (rejecting something like 1+true with a syntax error), and that never works out well.
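
One way to prepare for such a later check is sketched below. The `# ...` alternative labels are hypothetical names that do not appear in the original grammar. Here initDefinition deliberately accepts any boolExpression, and the labels let a listener or visitor tell the alternatives apart after parsing.

```antlr
initDefinition  : INIT ':' boolExpression ;     // deliberately too permissive; checked after parsing

derDefinition   : DERIVE ':' boolExpression ;

boolExpression  : boolExpression (boolOp|relOp) boolExpression   # BinaryExpr
                | NOT boolExpression                             # NotExpr
                | '(' boolExpression ')'                         # ParenExpr
                | attributeName                                  # AttributeExpr
                | numliteral                                     # NumberExpr
                | boolliteral                                    # BoolLiteralExpr
                ;
```

With these labels ANTLR generates one context class per alternative (for example BoolLiteralExprContext), so a pass that runs after parsing could report a semantic error when the expression under an initDefinition is anything other than a boolean literal.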