5
votes

I believe I am having trouble understanding how shift reduce conflicts work. I understand that bison can look ahead by one, so I don't understand why I am having the issue.

In my language a List is defined as a set of numbers or lists between [ ]. For example [] [1] [1 2] [1 [2] 3] are all valid lists.

Here are the definitions that are causing problems

 value: num 
    | stringValue
    | list          
    ;

list: LEFTBRACE RIGHTBRACE  
    | LEFTBRACE list RIGHTBRACE 
    | num list          
    | RIGHTBRACE            
    ;

The conflict happens from the number, it doesn't know weather to shift by the list rule, or reduce by the value rule. I am confused because can't it check to see if a list is following the number?

Any incite on how I should proceed would be greatly appreciated.

2

2 Answers

6
votes

I think I'd define things differently, in a way that avoids the problem to start with, something like:

value: num
     | stringvalue
     | list
     ;

items:
     | items value
     ;

list: LEFTBRACE items RIGHTBRACE;

Edit: Separating lists of numbers from lists of strings can't be done cleanly unless you eliminate empty lists. The problem that arises is that you want to allow an empty list to be included in a list of numbers or a list of strings, but looking at the empty list itself doesn't let the parser decide which. For example:

[ [][][][][][][][] 1 ]

To figure out what kind of list this is, the parser would have to look ahead all the way to the 1 -- but an LALR(N) parser can only lookahead N symbols to make that decision. Yacc (and Byacc, Bison, etc.) only do LALR(1), so they can only look ahead one symbol. That leaves a few possibilities:

  1. eliminate the possibility of empty lists entirely
  2. Have the lexer treat an arbitrary number of consecutive empty lists as a single token
  3. use a parser generator that isn't limited to LALR(1) grammars

Inside of a yacc grammar, however, I don't think there's much you can do -- your grammar simply doesn't fit yacc's limitations.

1
votes

With a bottom up parser it is generally a good idea to avoid right recursion which is what you have in the starred line of the grammar below.

list: LEFTBRACE RIGHTBRACE  
    | LEFTBRACE list RIGHTBRACE 
  **| num list**          
    | RIGHTBRACE  

Instead have you thought about something like this?

value:value num
   | value string
   | value list
   | num
   | string
   | list

list: LEFTBRACE RIGHTBRACE
   | LEFTBRACE value RIGHTBRACE 

This way you have no right recursion and the nesting logic of the grammar is more simply expressed.