0
votes

I am currently implementing the part of the Decaf (programming language) grammar. Here is the relevant snippet of bison code:

type:
  INT
  | ID
  | type LS RS
  ;

local_var_decl:
  type ID SEMICOLON
  ;

name:
  THIS
  | ID
  | name DOT ID
  | name LS expression RS
  ;

Nevertheless, as soon as I started working on name production rule, my parser gives the reduce-reduce warning.

Here what it's inside the .output file (generated by bison):

State 84

   23 type: ID .
   61 name: ID .

    ID        reduce using rule 23 (type)
    LS        reduce using rule 23 (type)
    LS        [reduce using rule 61 (name)]
    $default  reduce using rule 61 (name)

So, if we give the following input { abc[1] = abc; }, it says that syntax error, unexpected NUMBER, expected RS. NUMBER comes here from expression rule (basically, how it must have parsed it), though it tries to parse it through local_var_decl rule.

What do you think should be changed in order to solve this problem? Spent around 2 hours, tried different stuff, did not work.

Thank you!!

PS. Here is the link to the full .y source code

1

1 Answers

1
votes

This is a specific instance of a common problem where the parser is being forced to make a decision before it has enough information. In some cases, such as this one, the information needed is not far away, and it would be sufficient to increase the lookahead, if that were possible. (Unfortunately, few parser generators produce LR(k) parsers with k > 1, and bison is no exception.) The usual solution is to simply allow the parse to continue without having to decide. Another solution, with bison (but only in C mode) is to ask for a %glr-parser, which is much more flexible about when reductions need to be resolved at the cost of additional processing time.

In this case, the context allows either a type or a name, both of which can start with an ID followed by a [ (LS). In the case of a name, the [ must be followed by a number; in the case of a type, the [ must be followed by a ]. So if we could see the second token after the ID, we could immediately decide.

But we can only see one token ahead, which is the ]. And the grammar insists that we be able to make an immediate decision because in one case we must reduce the ID to a name and in the other case, to a type. So we have a reduce-reduce conflict, which bison resolves by always using whichever reduction comes first in the grammar file.

One solution is to avoid forcing this choice, at the cost of duplicating productions. For example:

type_other:
  INT
  | ID LS RS
  | type_other LS RS
  ;
type: ID
  | type_other
  ;

name_other:
  THIS
  | ID LS expression RS
  | name_other DOT ID
  | name_other LS expression RS
  ;
name: ID
  | name_other
  ;