Part of the answer is to look hard at the output file from bison -v
. For your first grammar, I got these extracts:
State 8 conflicts: 1 shift/reduce
State 9 conflicts: 1 shift/reduce
State 10 conflicts: 1 shift/reduce
State 11 conflicts: 1 shift/reduce
State 12 conflicts: 1 shift/reduce
Grammar
0 $accept: expr $end
1 expr: NUMBER
2 | expr '+' expr
3 | expr '-' expr
4 | expr '*' expr
5 | expr '/' expr
6 | expr expr
So there are 5 shift/reduce conflicts in the grammar. These are the less serious type of conflict; you can state that you expect them with %expect 5
in the grammar, if you're convinced that what the grammar is doing is correct.
state 0
0 $accept: . expr $end
NUMBER shift, and go to state 1
expr go to state 2
state 1
1 expr: NUMBER .
$default reduce using rule 1 (expr)
state 2
0 $accept: expr . $end
2 expr: expr . '+' expr
3 | expr . '-' expr
4 | expr . '*' expr
5 | expr . '/' expr
6 | expr . expr
$end shift, and go to state 3
'+' shift, and go to state 4
'-' shift, and go to state 5
'*' shift, and go to state 6
'/' shift, and go to state 7
NUMBER shift, and go to state 1
expr go to state 8
state 3
0 $accept: expr $end .
$default accept
state 4
2 expr: expr '+' . expr
NUMBER shift, and go to state 1
expr go to state 9
States 5, 6, 7 mimic state 4 but for the other operators. State 8 is the first of the states with a shift/reduce conflict. Remember the .
(dot) in the rules indicates where the parser is when it reaches this state.
state 8
2 expr: expr . '+' expr
3 | expr . '-' expr
4 | expr . '*' expr
5 | expr . '/' expr
6 | expr . expr
6 | expr expr .
NUMBER shift, and go to state 1
NUMBER [reduce using rule 6 (expr)]
$default reduce using rule 6 (expr)
expr go to state 8
state 9
2 expr: expr . '+' expr
2 | expr '+' expr .
3 | expr . '-' expr
4 | expr . '*' expr
5 | expr . '/' expr
6 | expr . expr
'*' shift, and go to state 6
'/' shift, and go to state 7
NUMBER shift, and go to state 1
NUMBER [reduce using rule 2 (expr)]
$default reduce using rule 2 (expr)
expr go to state 8
There are differences and similarities between these two states, but states 10, 11, 12 match state 9 except for the different point of ambiguity.
The trouble is that when the grammar sees:
NUMBER OP NUMBER NUMBER
it cannot tell whether to parse that as:
( NUMBER OP NUMBER ) NUMBER
expr expr
or as:
NUMBER OP ( NUMBER NUMBER )
expr OP expr
Given that it is a shift/reduce conflict in each case, it chooses to shift. If that's what you want, then add the %expect 5
and get on with life. If that's not what you want, then you need to rethink your grammar. What does a pair of adjacent numbers indicate, and are you sure you don't need some operator character (perhaps a comma or colon) to separate them?
I tried raising the precedence of the missing operator by using:
%left MISSING
after the other precedence declarations, and then using:
expr expr %prec MISSING
This did not change anything. Neither does making the precedence of MISSING very low by listing it before the other operators.
You get an inkling of the problems if you consider how an expression like this should be parsed:
NUMBER OP NUMBER NUMBER NUMBER OP NUMBER NUMBER OP NUMBER
Where the OP is the same in each appearance. My brain is hurting! So is bison
's!