1
votes

I have written this YACC program for validate string w.r.t grammar {ckanbm: n ≠ m ∧ k,m,n > 0}. NL is for newline. And tokens are passed by lex which is already there. But this error is given. I think the production rules are ok, but I receive this message:

[waliullah.mtcs17@cseresearch wali1]$ yacc -d assign1.y
yacc: 2 shift/reduce conflicts, 1 reduce/reduce conflict.

Here is my grammar file:

%{
#include<stdio.h>
%}
%token NL A B C
%%
stmt : C stmt
 | X NL
 | Y NL
 ;
X    : A X B
 | X B
 | B
 ;
Y    : A Y B
 | A Y
 | A
 ;
%%
int yyerror(char *msg)
{
printf("Invalid String!\n");
exit(0);
}
1

1 Answers

2
votes

Your grammar is accurate and unambiguous, as far as I can see. But it is not finite lookahead, much less LR(1).

An LR(k) parser must be able to predict a reduction at the point in the input where the non-terminal is complete, based only on an examination of the next k tokens.

Now consider the input consisting of a large number (relative to k) of A tokens, followed by a B. If the input will eventually match Y, the parser must now reduce Y → A and then possibly one or more instances of Y → A Y, and the left-to-right nature of LR parsing means that these reductions must take place immediately.

But there is no way of knowing how many instances to reduce without seeing the entire rest of the input and counting the number of Bs. Moreover, it is entirely possible that the input matches X, in which case the B must be shifted since the reduction to Y is incorrect. Again, there is not enough information to make this decision until the end of the input is reached.

These dilemmas are revealed in the state table produced by passing the -v option to yacc/bison. For example, we can see:

State 1

    4 X: A . X B
    7 Y: A . Y B
    8  | A . Y
    9  | A .

    A  shift, and go to state 1
    B  shift, and go to state 2

    B         [reduce using rule 9 (Y)]
    $default  reduce using rule 9 (Y)

    X  go to state 7
    Y  go to state 8

which expresses the question of whether to reduce the last A to a Y on the assumption that there are more As than Bs, or to shift the first B in preparation for reducing it to an X, on the assumption that there are more Bs than As.

(The conflict is indicated by the presence of two different actions for lookahead B. The reduce action is contained within brackets ([reduce using rule 9 (Y)]) to indicate that bison has resolved the conflict by choosing to always shift. Of course, that is not always the correct resolution, so your parser will not correctly identify inputs which require the reduction action.)

This language does have an LR(1) grammar. The trick is to recognize the inner balanced sequence of As and Bs first, without forcing the parser to decide which type of sentence it has encountered. That decision can be made when a B is encountered which does not match any A, or when the end of the sentence is found with some As yet unmatched.

Here, AB is a balanced sequence of As and Bs. AAB one or more As followed by an AB; while ABB is an AB followed by one or more Bs.

stmt: tail NL
    | C tail NL
tail: AAB
    | ABB
AAB : A AB
    | A AAB
ABB : AB B
    | ABB B
AB  : /* empty */
    | A AB B

It is tempting to write

AAB : AA AB
AA  : A
    | A AA

but it won't work, because it requires the parser to decide whether an A is part of an AA before it ever sees a B. The way it is written above, the A is always shifted onto the stack, and a decision is made as the parser stack is unrolled.

On the other hand, it would have been possible to have written the production for ABB in this way (ABB: AB BB; BB: B | BB B), because by the time the parser reaches the first unmatched B all relevant decisions have already been made. I chose to do it as above for symmetry.