Why does right recursive grammar require bigger parser from Yacc?

Question

From the Yacc introduction by Stephen C. Johnson:

With right recursive rules, such as
    seq     :       item
            |       item  seq
            ;
the parser would be a bit bigger, and the items would be seen, and reduced, from right to left. More seriously, an internal stack in the parser would be in danger of overflowing if a very long sequence were read. Thus, the user should use left recursion wherever reasonable.

I know Yacc generates LR parsers, so I tried parsing some simple right recursive grammar by hand. I couldn't see the problem so far. Anyone can give an example to demonstrate these problems?

Jonathan Leffler Jonathan Leffler · Accepted Answer · 2013-10-22T19:20:19

The parser size is not a serious issue (most of the time).

The runtime stack size can be an issue. The trouble is that the right recursive rules mean that stack can't be reduced until the parsers reached the end of the sequence, whereas with the left recursive rule, each time the grammar encounters an seq item, it can reduce the number of items on the stack.

Classically, the stack for tokens was fixed and limited in size. Consequently, a right recursive rule such as one to handle:

IF <cond> THEN
    <stmt-list>
ELSIF <cond> THEN
    <stmt-list>
ELSIF <cond> THEN
    <stmt-list>
ELSE
    <stmt-list>
ENDIF

would limit the number of terms in the chain of ELIF clauses that the grammar could accept.

Hypothetical grammar:

if_stmt: if_clause opt_elif_clause_list opt_else_clause ENDIF
    ;
if_clause: IF condition THEN stmt_list
    ;
opt_elif_clause_list: /* Nothing */
    | elif_clause opt_elif_clause_list  /* RR */
    ;
elif_clause: ELIF condition THEN stmt_list 
    ;
opt_else_clause: /* Nothing */
    |   ELSE stmt_list
    ;
stmt_list: stmt
    | stmt_list stmt       /* LR */
    ;

I seem to remember doing some testing of this a long time ago now (a decade or more ago), and the Yacc that I was using at the time, in conjunction with the language grammar, similar to that above, meant that after about 300 ELIF clauses, the parser stopped (I think it stopped under control, recognizing it had run out of space, rather than crashing oblivious of the space exhaustion).

Why does right recursive grammar require bigger parser from Yacc?

2 Answers