0
votes

Exercise

I'm trying to implement the grammar of this exercise using lex & yacc, but it throws this conflict: warning: 5 shift/reduce conflicts [-Wconflicts-sr] and takes every string as invalid, I think it has to be something with the precedence: It is supposed to accept strings like: ba#ababadada#bad#dabbada and reject: dad#ad#abaadad#badadbaad This is what I've tried:

%{
#include <stdio.h>
int flag=0;
%}
%{
void yyerror();
%}

%token A B D NL
%%
str1        :   Oracion NL      { printf("\n Sequence accepted");return(0);}
        ;
Oracion     :   Oracion '#' Parrafo     { }
        |   Parrafo         { }
        ;
Parrafo     :   Silaba Parrafo Silaba           { }
        |   Silaba              { }
        ;
Silaba      :   Explosion Alto      { }
        |   A Explosion             { }
        |   A Alto          { }
        |   Explosion           { }
        ;
Explosion   :   Alto A              { }
        ;
Alto        :   B               { }
        |   D               { }
        ;   
%%
void main()
{
    printf("\n write a sequence\n");
    yyparse();
    if(flag == 0)
        printf("\n Valid sequence");
}
void yyerror()
{
    printf("\n Invalid Sequence\n\n");
    flag=1;
}
1
Note: return (0) in the action str1: Oracion NL { printf("\n Sequence accepted");return(0);} is not correct. Did your instructor tell you to do that? Or did you get it from some on-line tutorial (which probably indicates that the tutorial isn't very good).rici
Please don't delete answered questions. It shows huge disrespect for the people who contributed their time freely to answer.rici
Yesterday someone with the same curious use of return 0 and a variable called flag deleted a question which I was in the process of answering. Perhaps they were a classmate of yours, I don't know. But it's super annoying. Anyway, I'd really like to know where that idiom comes from.rici

1 Answers

0
votes

That grammar allows four syllables (ignoring the non-syntactic distinction between b and d):

ba
bab
ab
aba

A word is a sequence containing an odd number of syllables. Since syllable boundaries are not marked in any way, the division into syllables is ambiguous, which is why bison reports conflicts. For example, babab could be parsed as

ba-bab  (Explosión / Explosión Alto)
bab-ab  (Explosión Alto / "a" Alto)

In a few cases, the fact that words must consist of an odd number of syllables can disambiguate a parse. For example, bababa can only be decomposed as ba-ba-ba (Explosión / Explosión / Explosión), because bab-aba (Explosión Alto / "a" Explosión) has only two syllables, an even number. In particular, this means that the greedy solution -- always using the longest possible syllable -- will sometimes produce an incorrect parse. Moreover, you cannot guarantee the correct parse without arbitrary lookahead, making an LR parser inappropriate. Consider the following two words:

abababbabbabbabbabbabbab
abababbabbabbabbabbabbabbab

A double consonant is an unambiguous syllable break, so the only ambiguous break is in the first six characters, which -- as above -- could be two or three syllables. Looking at the entire word, it will be clear which of these choices leads to a word with an odd number of syllables. But until the entire word has been seen, the parser cannot know. In particular, after it sees the initial ab, it must decide whether to resolve that as a syllable of the form a Alto or shift the following a in order to resolve it as a Explosión. But any fixed lookahead limit won't be sufficient to make this decision in all cases.

I suppose that the exercise is intended to help you understand this ambiguity (unless there is more to the exercise than what you showed in the question). I'm not sure what else you're being asked to do. If you want to use bison in order to actually produce parse trees, you'll have to deal with the ambiguity in some way, which probably means using a GLR parser.