0
votes

I try to implement a function call in a custom grammar (plus a similar array-access operator):

expression
    :   ....OTHER EXPRESSION RULES....

    | expression PARENTHESIS_OPEN expressions PARENTHESIS_CLOSE   {  }     %prec DOT
    | expression SQUARE_OPEN expressions SQUARE_CLOSE      {  }          %prec DOT
    ;

Here is my all my operator precedences:

%right ASSIGN ASSIGN_MOD ASSIGN_XOR ASSIGN_AND ASSIGN_STAR ASSIGN_MINUS ASSIGN_PLUS ASSIGN_OR ASSIGN_DIV ASSIGN_LSHIFT ASSIGN_RSHIFT
%right QUESTION COLON
%left OR
%left AND
%left BIN_OR
%left XOR
%left BIN_AND
%left NOT_EQUALS NOT_SAME EQUALS SAME
%left LESS LESS_EQUALS MORE MORE_EQUALS
%left LSHIFT RSHIFT
%left PLUS MINUS
%left PERCENT STAR SLASH
%right TILDE NOT DECREASE INCREASE
%left DOT

Please note that DOT is has the highest precedence. So, I try to give this to my function-call rules. Still, I get 74 shift/reduce warnings, that all follow this pattern:

State 25
15 expression: expression . PLUS expression
16           | expression . MINUS expression
17           | expression . NOT_EQUALS expression
18           | expression . NOT_SAME expression
19           | expression . PERCENT expression
20           | expression . ASSIGN_MOD expression
21           | expression . XOR expression
22           | expression . ASSIGN_XOR expression
23           | expression . BIN_AND expression
24           | expression . AND expression
25           | expression . ASSIGN_AND expression
26           | expression . STAR expression
27           | expression . ASSIGN_STAR expression
28           | expression . ASSIGN_MINUS expression
29           | expression . ASSIGN expression
30           | expression . EQUALS expression
31           | expression . SAME expression
32           | expression . ASSIGN_PLUS expression
33           | expression . BIN_OR expression
34           | expression . OR expression
35           | expression . ASSIGN_OR expression
36           | expression . SLASH expression
37           | expression . ASSIGN_DIV expression
38           | expression . DOT expression
39           | expression . LESS expression
40           | expression . LESS_EQUALS expression
41           | expression . LSHIFT expression
42           | expression . ASSIGN_LSHIFT expression
43           | expression . MORE expression
44           | expression . MORE_EQUALS expression
45           | expression . RSHIFT expression
46           | expression . ASSIGN_RSHIFT expression
48           | expression . PARENTHESIS_OPEN expressions PARENTHESIS_CLOSE
49           | expression . SQUARE_OPEN expressions SQUARE_CLOSE
53           | DECREASE expression .
55           | expression . DECREASE
56           | expression . INCREASE

PARENTHESIS_OPEN  shift, and go to state 46
DECREASE          shift, and go to state 47
INCREASE          shift, and go to state 52
SQUARE_OPEN       shift, and go to state 54
DOT               shift, and go to state 61

PARENTHESIS_OPEN  [reduce using rule 53 (expression)]
SQUARE_OPEN       [reduce using rule 53 (expression)]
$default          reduce using rule 53 (expression)

State 46, that the conflicted shift indicates, says the following:

State 46

48 expression: expression PARENTHESIS_OPEN . expressions PARENTHESIS_CLOSE

MINUS             shift, and go to state 5
TILDE             shift, and go to state 6
NOT               shift, and go to state 7
PARENTHESIS_OPEN  shift, and go to state 8
DECREASE          shift, and go to state 9
INCREASE          shift, and go to state 10
INT               shift, and go to state 11
FLOAT             shift, and go to state 12
STRING            shift, and go to state 13
CHAR              shift, and go to state 14
ID                shift, and go to state 15

$default  reduce using rule 59 (expressions)

expression   go to state 87
expressions  go to state 88

I really do not get why bison chooses to reduce. Since I gave the function-call rule the highest possible precedence, bison should try to shift until it matches that one. Still, the prefix DECREASE operator looks like been the choice of bison, even if it has lower precedence.

Why bison does that? How can I say clearly to bison that the function-call rule should have the higher precedence, and thus, avoid the conflicts?

1

1 Answers

3
votes

The following is quoted from this answer:

Recall that a precedence relation is defined between a production and a terminal. It does not relate two terminals nor two productions (and so cannot be used to resolve reduce-reduce conflicts). The comparison between precedence of the production which could be reduced and the lookahead terminal determines whether a reduce or a shift will occur.

A %prec declaration (re-)defines the precedence of the reduction it is part of. In your case,

| expression PARENTHESIS_OPEN expressions PARENTHESIS_CLOSE   {  }     %prec DOT
| expression SQUARE_OPEN expressions SQUARE_CLOSE      {  }          %prec DOT

declares that both of those reductions have the precedence of DOT, instead of PARENTHESIS_CLOSE and SQUARE_CLOSE [Note 1]. Since the latter two tokens don't appear in the %left / %right declarations, this is actually a definition of precedence, but it was unnecessary for two reasons:

  1. You could have just added PARENTHESIS_CLOSE and SQUARE_CLOSE to the appropriate precedence level, but more importantly

  2. These two reductions do not participate in any shift/reduce conflict.

You should make some attempt to understand (and hopefully agree with) my claim in item (2). As a place to start, consider State 25 which you've included in your question. In State 25, the only possible reduction is by rule 53 (expression: DECREASE expression). You can see that because that is the only item in the state which has the . at the right-hand edge. Only items which have the dot at the right-hand edge can be reduced (since the dot being at the right-hand edge indicates that the production corresponding to the item might be complete in this state.) And, indeed, you can see the shift/reduce conflicts reported for this state:

PARENTHESIS_OPEN  shift, and go to state 46
PARENTHESIS_OPEN  [reduce using rule 53 (expression)]

SQUARE_OPEN       shift, and go to state 54
SQUARE_OPEN       [reduce using rule 53 (expression)]

Both of those conflicts involve a possible reduction using rule 53.

So in State 25 if a ( is the lookahead character, the grammar would allow either

  • a shift of the (, leading to a state with the item expression: expression PARENTHESIS_OPEN . expressions PARENTHESIS_CLOSE (note how the dot has moved over the PARENTHESIS_OPEN token).

  • or a reduction of the rule expression: DECREASE expression.

Bison resolves this conflict by comparing the precedence of the reduction (DECREASE) with the precedence of the look-ahead token (PARENTHESIS_OPEN). PARENTHESIS_OPEN does not appear in any precedence level, so Bison falls back on its default, which is to prefer shifting.

Evidently, changing the precedence of the reduction expression: expression PARENTHESIS_OPEN expressions PARENTHESIS_CLOSE has no impact on the resolution of this conflict, because that reduction is not relevant to this conflict.

Now, my claim is that that reduction is not relevant to any conflict in the grammar. That might seem a slightly outlandish claim, since I cannot see much of the grammar, and indeed I could be wrong. In theory, there could be some other state in the table which included the item:

expression: expression PARENTHESIS_OPEN expressions PARENTHESIS_CLOSE .

and also included some item in which a shift was possible, such as

some_non_terminal: expression PARENTHESIS_OPEN expressions PARENTHESIS_CLOSE . something

That just doesn't seem likely to me.

Normally, postfix operator reductions (and function call and array indexing are conceptually postfix operators) never participate in shift-reduce conflicts precisely because there is practically never a possible shift after a postfix operator. If there were such a shift, the operator would be infix, not postfix. You could imagine a grammar in which an operator symbol could be either an infix and a postfix operator, by analogy with operators like - which could be either infix or prefix. But it turns out that the situation is not symmetric for reasons beyond the scope of this answer. [Note 2]

To get back to the original question: We've seen that the shift/reduce conflict is between the reduction expression: DECREASE expression (in this case) and the terminals PARENTHESIS_OPEN and SQUARE_OPEN, and it cannot be resolved because PARENTHESIS_OPEN and SQUARE_OPEN are not listed in your precedence levels. So the solution is to list them:

/* ... */
%left PERCENT STAR SLASH
%precedence TILDE NOT DECREASE INCREASE
%precedence PARENTHESIS_OPEN SQUARE_OPEN

Note that I changed the last %left and %right to %precedence, which is a bison extension which allows you to define a precedence level for operators for which associativity is meaningless. I did that because I think it is clearer. [Note 3]

Notes

  1. There is really very little good to be said for using PARENTHESIS_OPEN rather than the much simpler and more readable '('. Yacc and bison allow single character tokens to be single-quoted like that precisely to allow for the readability of

    expression:  expression '(' expressions ')'
    expression:  expression '[' expressions ']'
    

    That also simplifies your (f)lex scanner, because a single fallback rule can handle all four of those tokens, and all the other single character tokens, including the ones you have yet to add to your grammar:

          /* Put this rule at the end of your ruleset */ 
    .    { return *yytext;}
    
  2. For example, suppose that ! could be postfix or infix, and consider the expression a!-b*4. The ambiguity here ((a!)-b or a!(-b)) is compounded by the fact that the binary bang, minus and multiply operators also have active precedence rules.

  3. Unary operators, whether prefix or postfix, don't have associativity because associativity only applies to binary operators. Associativity is why a + b + c is (a + b) + c while a = b = c is a = (b = c). In contrast, there is only one way to parse --a or !!a (or combinations thereof). (This is also clear if you think about how precedence relations affect shift-reduce conflict resolutions. What is the conflict which would require a unary reduction ('!' expression .) to be compared with a ! lookahead symbol? In the case of dual-purposed operators like '-', we need to change the precedence of the reduction to a pseudo-terminal (%prec UMINUS), after which it's clear that associativity cannot apply because UMINUS cannot possibly be a lookahead symbol.