0
votes

I get 2 shift/reduce errors when trying to compile my grammar:

program : declaration_list ;
declaration_list : declaration_list declaration | declaration ;
declaration : var_declaration | fun_declaration ; 

var_declaration: type_specifier var_decl_list ;
scoped_var_declaration: scoped_type_specifier var_decl_list;
var_decl_list : var_decl_list "," var_decl_initialize | var_decl_initialize ;
var_decl_initialize : var_decl_id ;
var_decl_id : ID | ID "[" INT_NUM "]" ;
scoped_type_specifier : type_specifier;
type_specifier: INT | CHAR ;
fun_declaration: type_specifier ID "(" params ")" statement | ID "(" params ")" statement ;
params : | param_list ;
param_list: param_list ";" param_type_list | param_type_list ;
param_type_list : type_specifier param_id_list ; 
param_id_list : param_id_list "," param_id | param_id ; 
param_id : ID | ID "[" "]" ;


statement: expression_stmt | compound_stmt | selection_stmt | iteration_stmt | return_stmt | break_stmt ;

compound_stmt : "{" local_declaration statement_list "}" ;
local_declaration : | local_declaration scoped_var_declaration ;
statement_list : | statement_list statement ;
expression_stmt : expression ";" | ";" ;
selection_stmt : "if" "(" simple_expression ")" statement | "if" "(" simple_expression ")" statement "else" statement ;
iteration_stmt: "while" "(" simple_expression ")" statement ;
return_stmt : "return" ";" | "return" expression ;
break_stmt : "break" ;


expression : mutable ASSIGN expression | mutable EINC expression | mutable EDEC expression | mutable INC | mutable DEC | simple_expression ;

simple_expression: simple_expression OR and_expression | and_expression ; 

and_expression: and_expression AND unary_rel_expression | unary_rel_expression ;

unary_rel_expression: "!" unary_rel_expression | rel_expression ;

rel_expression : sum_expression relop sum_expression | sum_expression ;

relop: "<" | ">" | LE | GE | COMP_OP | NE ;

sum_expression : sum_expression sumop term | term ; 

sumop : ADD | SUB ;

term: term mulop unary_expression | unary_expression ;

mulop: MULT | DIV | REM ;

unary_expression : unaryop unary_expression | factor ;

unaryop : "-" | "*" ;

factor : immutable | mutable ;

mutable : ID | ID "[" expression "]" ;

immutable : "(" expression ")" | call | constant ; 

call: ID "(" args ")" ;

args:  | arg_list;

arg_list : arg_list","expression | expression ;

constant : INT_NUM | STRINGCONST | CHARCONST | 'true' | 'false' ;

I get 2 shift/reduce at

State 38 conflicts: 1 shift/reduce State 139 conflicts: 1 shift/reduce

The specific states are :

state 38

   81 mutable: ID .  [$end, ADD, SUB, DIV, MULT, COMP_OP, INT, CHAR, CHARCONST, STRINGCONST, GE, LE, NE, EINC, EDEC, INC, DEC, AND, OR, REM, ASSIGN, INT_NUM, ID, ",", "]", "(", ")", ";", "{", "}", "if", "else", "while", "return", "break", "!", "<", ">", "-", "*", 't', 'f']
   82        | ID . "[" expression "]"
   86 call: ID . "(" args ")"

    "["  shift, and go to state 77
    "("  shift, and go to state 78

    "("       [reduce using rule 81 (mutable)]
    $default  reduce using rule 81 (mutable)




state 139

   40 selection_stmt: "if" "(" simple_expression ")" statement .  [$end, INT, CHAR, CHARCONST, STRINGCONST, INT_NUM, ID, "(", ";", "{", "}", "if", "else", "while", "return", "break", "!", "-", "*", 't', 'f']
   41               | "if" "(" simple_expression ")" statement . "else" statement

    "else"  shift, and go to state 141

    "else"    [reduce using rule 40 (selection_stmt)]
    $default  reduce using rule 40 (selection_stmt)

Could someone please explain how to fix this error, I know it has something to do with precedence.

2
Try to reduce your grammar by removing rules until the problem disappears. You have a non-trivial grammar that is somewhat hard to debug as a whole.dhke

2 Answers

4
votes

The shift-reduce conflict in state 38 is a result of the following production:

return_stmt : "return" ";"
            | "return" expression
            ;

Note that this production allows a statement to end with an expression. Other than this production, statements must end with either ; or }. If a statement can end with an expression, then the grammar becomes ambiguous. Consider, for example:

return a
(1+1);

which could be a return_stmt (returning the value of a) followed by an expression_stmt, or could be a single return_stmt, returning the value of the function call a(1+1).

Most likely, this was just a typo in your grammar, and you intended:

 return_stmt : "return" ";"
            | "return" expression ";"
            ;

(But see note below about quotes.)


The shift-reduce conflict at state 139 is the classic C-style if conflict, which is the result of an ambiguity in the language:

if (c1) if(c2) s1; else s2;

Which if does the else apply to? Of course, we all expect it to be the inner if, which is the only sensible answer (because when we're looking at the else we have no idea whether or not there is another else following), but the grammar itself allows either.

bison will do the right thing in this case, as you can see. (It chooses to shift the else, which means that it will not reduce the inner if without the else.) Consequently, the simplest possible solution is to ignore this particular shift-reduce conflict. (See the %expect directive.)

If that doesn't appeal to you, you have two alternatives: one is to use precedence declarations to explicitly give priority to the shift and the other is to be more explicit in the grammar, in order to make the correct parse mandatory. Both of these possibilities are well explored in the two answers to this question, featuring a remarkably similar grammar. (To see the precedence solution, you have to follow the link in Akim's answer.)


If you are using bison, you should be aware that "word" and 'x' are completely different syntaxes. The first one uses a token which has been given a human readable name, usually using a declaration like this:

%token TOKEN_RETURN "return"

You need the all-caps enumeration name for your scanner, so that you can write (assuming your scanner is written in flex):

"return" { return TOKEN_RETURN; }

but the grammar is much more readable with the quoted strings. (For example, what does EINC mean in your language? I have no idea...)

If you don't declare the enumeration name for the token, bison won't complain, and it will still assign a code number for the token, but it will be much more difficult for you to figure out what the code number is.

On the other hand, '(' represents a single character token, whose code number is the ascii code for the character. (So it corresponds to the usage of ( in C.) In this case, the code number is known, and you could write in flex:

"(" { return '('; }

Although you would be better off doing this with the fall back rule:

 . { return yytext[0]; }

In short, you probably want to change all the uses of, for example, ";" to ';', and also change 'true' and 'false' to "true" and "false". And you might also want to replace some of the other keyword tokens like "LE" and "INC" to "<=" and "++", but also adding appropriate %token declarations for all double-quoted tokens.

-2
votes