1
votes

From the Yacc introduction by Stephen C. Johnson:

With right recursive rules, such as

    seq     :       item
            |       item  seq
            ;

the parser would be a bit bigger, and the items would be seen, and reduced, from right to left. More seriously, an internal stack in the parser would be in danger of overflowing if a very long sequence were read. Thus, the user should use left recursion wherever reasonable.

I know Yacc generates LR parsers, so I tried parsing some simple right recursive grammar by hand. I couldn't see the problem so far. Anyone can give an example to demonstrate these problems?

2

2 Answers

2
votes

The parser size is not a serious issue (most of the time).

The runtime stack size can be an issue. The trouble is that the right recursive rules mean that stack can't be reduced until the parsers reached the end of the sequence, whereas with the left recursive rule, each time the grammar encounters an seq item, it can reduce the number of items on the stack.

Classically, the stack for tokens was fixed and limited in size. Consequently, a right recursive rule such as one to handle:

IF <cond> THEN
    <stmt-list>
ELSIF <cond> THEN
    <stmt-list>
ELSIF <cond> THEN
    <stmt-list>
ELSE
    <stmt-list>
ENDIF

would limit the number of terms in the chain of ELIF clauses that the grammar could accept.

Hypothetical grammar:

if_stmt: if_clause opt_elif_clause_list opt_else_clause ENDIF
    ;
if_clause: IF condition THEN stmt_list
    ;
opt_elif_clause_list: /* Nothing */
    | elif_clause opt_elif_clause_list  /* RR */
    ;
elif_clause: ELIF condition THEN stmt_list 
    ;
opt_else_clause: /* Nothing */
    |   ELSE stmt_list
    ;
stmt_list: stmt
    | stmt_list stmt       /* LR */
    ;

I seem to remember doing some testing of this a long time ago now (a decade or more ago), and the Yacc that I was using at the time, in conjunction with the language grammar, similar to that above, meant that after about 300 ELIF clauses, the parser stopped (I think it stopped under control, recognizing it had run out of space, rather than crashing oblivious of the space exhaustion).

2
votes

I'm not at all sure why he says the right recursive parser will be bigger -- in general it will require one fewer state in its state machine (which should if anything make it smaller), but the real issue is that the right recursive grammar requires unbounded stack space, while the left recursive grammar requires constant space (O(n) space vs O(1) space).

Now O(n) vs O(1) may sound like a big deal, but depending on what you are doing, it may be unimportant. In particular, if you're reading your entire input into memory to process it, that's O(n) space that completely overwhelms the O(n) vs O(1) distinction for right vs left recursion. If you're using a particularly old version of yacc that still has a fixed parser stack, it may be a problem, but recent versions of yacc (Berkeley yacc, bison) automatically expand their parse stack on demand, so the only limit is available memory.