I have the following lex definitions:
[a-zA-Z][a-zA-Z0-9_]* return NAME;
\, return COMMA;
\: return COLON;
\; return SEMICOLON;
\( return OPAREN;
\) return CPAREN;
\+ return PLUS;
And the following yacc production rules:
program:
| program statement;
arglist:
OPAREN CPAREN
| OPAREN expressionlist CPAREN;
trailed:
NAME
| trailed arglist;
expression:
trailed
| expression PLUS trailed;
expressionlist:
expression
| expressionlist COMMA expression;
statement:
expression SEMICOLON
| NAME arglist COLON expression SEMICOLON;
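To make this easier to reproduce, the rules above can be collected into one self-contained grammar file, roughly as sketched below. The %token line and the overall file layout are assumptions, since the declarations section is not shown here; the rules themselves are unchanged.

```yacc
/* Assumed declarations section: token names taken from the lex rules above. */
%token NAME COMMA COLON SEMICOLON OPAREN CPAREN PLUS

%%

program:
    /* empty */
  | program statement
  ;

arglist:
    OPAREN CPAREN
  | OPAREN expressionlist CPAREN
  ;

trailed:
    NAME
  | trailed arglist
  ;

expression:
    trailed
  | expression PLUS trailed
  ;

expressionlist:
    expression
  | expressionlist COMMA expression
  ;

statement:
    expression SEMICOLON
  | NAME arglist COLON expression SEMICOLON   /* the alternative that triggers the conflict */
  ;
```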
Everything compiles fine if I comment out the last rule (the second alternative of statement). With that rule enabled I get a conflict:
yacc: 1 shift/reduce conflict.
So I guess yacc cannot decide whether to shift the next symbol onto the stack or to reduce the stack with a given rule.
Is my grammar ambiguous?
Shouldn't the decision between the rules "trailed: trailed arglist" and "statement: NAME arglist COLON expression SEMICOLON" be free of conflict, because the former never contains a colon, while the latter always does?
Does this have something to do with the size of the look-ahead buffer?
How can I fix this grammar to parse both "a (b) ();" and "a (b, c): b + c;" as valid statements?
How can I trace the conflict in more detail?
---- EDIT
Concerning MichaelMoser's answer:
Changing
arglist:
OPAREN CPAREN
| OPAREN expressionlist CPAREN;
expressionlist:
expression
| expressionlist COMMA expression;
to
arglist: OPAREN expressionlist CPAREN;
expressionlist:
| expressionlist COMMA expression; /* this now also allows expression lists like ",a,b", but never mind */
as suggested doesn't help. The conflict still arises while the second rule for statement is active, and once I comment that rule out, no conflict is reported.