yacc shift/reduce conflict: function call vs function definition

Question

I have the following lex definitions:

[a-zA-Z][a-zA-Z0-9_]*       return NAME;

\,              return COMMA;
\:              return COLON;
\;              return SEMICOLON;
\(              return OPAREN;
\)              return CPAREN;
\+              return PLUS;

And the following yacc production rules:

program: 
    | program statement;

arglist:
    OPAREN CPAREN
    | OPAREN expressionlist CPAREN;

trailed:
    NAME
    | trailed arglist;

expression:
    trailed
    | expression PLUS trailed;

expressionlist:
    expression
    | expressionlist COMMA expression;

statement:
    expression SEMICOLON
    |NAME arglist COLON expression SEMICOLON;

Everything compiles well if I comment out the last rule. With the last rule I get a conflict:

yacc: 1 shift/reduce conflict.

So I guess yacc cannot decide whether to shift the next symbol onto the stack or to reduce the stack with a given rule.

Is my grammar ambiguous?
Shouldn't the decision between rule "trailed: trailed arglist" and "statement: NAME arglist COLON expression SEMICOLON" be without conflict, because the former never has a colon, while the latter always has?
Has this something to do with the size of the look-ahead buffer?
How can I fix this grammar to parse both "a (b) ();" and "a (b, c): b + c;" as valid statements?
How can I backtrack the conflict in a more detailed manner?

---- EDIT

Concerning MichaelMoser's answer:

Changing

arglist:
    OPAREN CPAREN
    | OPAREN expressionlist CPAREN;

expressionlist:
    expression
    | expressionlist COMMA expression;

to

arglist: OPAREN expressionlist CPAREN;

expressionlist:
| expressionlist COMMA expression; //this now allows for expression lists like ,a,b but NVM

as suggested doesn't help. The conflict still arises while the second rule for statement is active, and once that rule is commented out, no conflict is reported.

Chris Dodd · Accepted Answer · 2014-02-02T21:55:16

As others have noted, the problem is that you need more than one token of lookahead to differentiate between a function definition and a function call. The problem with the grammar as written is that it needs to decide between reducing the rule trailed: NAME and shifting to match the rule statement: NAME arglist COLON expression SEMICOLON after seeing a NAME when the lookahead is OPAREN. But it can't decide until after it sees the arglist to see if there's a COLON after it or not (which is what distinguishes the two cases).
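The two competing choices can be sketched by hand from the original rules. This is an annotated excerpt for illustration only, not output produced by yacc, and the comments describe the parser's position informally:

```yacc
/* The parser has just shifted NAME at the start of a statement,
   and the single lookahead token is OPAREN. */

trailed:
    NAME                    /* option 1: reduce NAME to trailed right now,
                               which is right for a call such as  a (b) ();  */
    ;

statement:
    NAME arglist COLON expression SEMICOLON
                            /* option 2: leave NAME on the stack and shift OPAREN,
                               which is right for a definition such as  a (b, c): b + c;  */
    ;
```

Both options are still viable when OPAREN is the lookahead, and the COLON that would settle the choice lies beyond the entire arglist. Running yacc with the -v option writes a description of the generated states to y.output, and the conflicting state there should involve these two rules.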

To fix this, you need to refactor the grammar so that there's no need to reduce anything only present on one alternative until you get to the COLON. With this grammar, you can do this by refactoring the trailed rule to always require at least one arglist, and making a NAME with no arglist a separate expression rule:

trailed:
    NAME arglist
    | trailed arglist;

expression:
    NAME
    | trailed
    | expression PLUS NAME
    | expression PLUS trailed;

Now when you get an input NAME OPAREN ... there's no need to reduce anything yet -- you just shift into the rule matching an arglist and after matching the arglist, you can see the next token and decide whether this is a function call or a function definition.
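Put together with the unchanged rules from the question, the refactored grammar might look like the sketch below. The %token line, the yyerror and main helpers, and the comments are assumptions added to make the file self-contained; the lexer is assumed to be the lex file from the question, built against the header yacc generates.

```yacc
%{
#include <stdio.h>

int yylex(void);   /* assumed to come from the lex file in the question */
void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }
%}

%token NAME COMMA COLON SEMICOLON OPAREN CPAREN PLUS

%%

program:
    /* empty */
    | program statement;

arglist:
    OPAREN CPAREN
    | OPAREN expressionlist CPAREN;

/* trailed now always contains at least one arglist */
trailed:
    NAME arglist
    | trailed arglist;

/* a bare NAME is an expression in its own right */
expression:
    NAME
    | trailed
    | expression PLUS NAME
    | expression PLUS trailed;

expressionlist:
    expression
    | expressionlist COMMA expression;

statement:
    expression SEMICOLON                        /* e.g.  a (b) ();         */
    | NAME arglist COLON expression SEMICOLON;  /* e.g.  a (b, c): b + c;  */

%%

int main(void) { return yyparse(); }
```

With this layout, after NAME the parser shifts on OPAREN and reduces expression: NAME only on tokens that can follow an expression. After NAME arglist it shifts on COLON and otherwise reduces to trailed. In both places one token of lookahead is enough, so the shift/reduce conflict should no longer be reported.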
