3
votes

I've got a problem in my Bison grammar. I've got a pair of shift/reduces which are fine, and six reduce/reduces. The issue is that I don't understand how the reduce/reduce conflicts come about, since the parser should know which to choose from tokens prior.

%token STRING_LITERAL
%token INTEGER
%token FLOAT
%token CHARACTER
%token PTR_OP INC_OP DEC_OP LEFT_OP RIGHT_OP LE_OP GE_OP EQ_OP NE_OP
%token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN
%token SUB_ASSIGN LEFT_ASSIGN RIGHT_ASSIGN AND_ASSIGN
%token XOR_ASSIGN OR_ASSIGN STATIC CATCH DOUBLE_COLON ELLIPSIS FUNCTION VAR
%token SIZEOF
%token GOTO
%token AUTO  
%token THIS VAR_ASSIGN
%token NAMESPACE
%token TRY
%token TYPE
%token DECLTYPE
%token PUBLIC
%token PRIVATE
%token PROTECTED
%token USING
%token THROW
%token FRIEND
%token COMPILETIME
%token RUNTIME
%token VIRTUAL
%token ABSTRACT 
%token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR CONTINUE BREAK RETURN
%%

global_scope_definition
    : namespace_definition
    | function_definition
    | variable_definition
    | using_definition
    | type_definition;

global_scope_definitions
    : global_scope_definition
    | global_scope_definitions global_scope_definition

program
    : global_scope_definitions;

type_expression
    : expression

variable_assignment
    : VAR_ASSIGN;

name_or_qualified_name
    : IDENTIFIER
    | name_or_qualified_name '.' IDENTIFIER;

namespace_definition
    : NAMESPACE name_or_qualified_name '{' namespace_scope_definitions '}';

accessibility_definition
    : PUBLIC ':'
    | PRIVATE ':'
    | PROTECTED ':'
    | FRIEND ':';

using_definition
    : USING IDENTIFIER '=' name_or_qualified_name ';'
    | USING name_or_qualified_name ';';

type_definition
    : TYPE IDENTIFIER type_literal;

namespace_scope_definition
    : accessibility_definition
    | global_scope_definition;

namespace_scope_definitions
    : namespace_scope_definition
    | namespace_scope_definitions namespace_scope_definition;

accessibility_modifier
    : PUBLIC
    | PROTECTED
    | PRIVATE
    | FRIEND;

accessibility_block
    : phase_block
    | accessibility_modifier phase_block;

phase_modifier
    : COMPILETIME
    | RUNTIME;

phase_block
    : definition_block
    | phase_modifier definition_block;

definition_block
    : default_definition_block
    | STATIC static_definition_block
    | VIRTUAL virtual_definition_block
    | ABSTRACT abstract_definition_block;

static_definition_block
    : '{' static_definitions '}';

static_definitions
    : static_definition
    | static_definitions static_definition;

static_definition
    : variable_definition
    | function_definition;


abstract_definition_block
    : '{' abstract_definitions '}';

abstract_definitions
    : abstract_definition
    | abstract_definitions abstract_definition;

abstract_definition
    : function_definition; 

virtual_definition_block
    : '{' virtual_definitions '}';

virtual_definitions
    : virtual_definition
    | virtual_definitions virtual_definition;

virtual_definition
    : function_definition;


default_definition_block
    : '{' default_definitions '}';

default_definitions
    : default_definition
    | default_definitions default_definition;

default_definition
    : variable_definition
    | function_definition
    | constructor_definition
    | destructor_definition
    | type_definition;

type_scope_definition
    : using_definition
    | default_definition
    | accessibility_block;

type_scope_definitions
    : type_scope_definition
    | type_scope_definitions type_scope_definition;

destructor_definition
    : '~' TYPE '(' ')' compound_statement;

constructor_definition
    : TYPE function_definition_arguments statements_and_inits;

statements_and_inits
    : inits compound_statement
    | compound_statement;

init
    : ':' IDENTIFIER function_call_expression;

inits
    : init
    | inits init;

function_definition_arguments
    : '(' ')'
    | '(' function_argument_list ')';

function_definition
    : type_expression IDENTIFIER function_definition_arguments compound_statement
    | type_expression IDENTIFIER function_definition_arguments function_definition_arguments compound_statement;

function_argument_definition
    : IDENTIFIER
    | type_expression IDENTIFIER
    | IDENTIFIER variable_assignment expression
    | type_expression IDENTIFIER variable_assignment expression
    | IDENTIFIER variable_assignment '{' expressions '}'
    | type_expression IDENTIFIER variable_assignment '{' expressions '}';

function_argument_list
    : function_argument_definition
    | function_argument_list ',' function_argument_definition;

static_variable_definition
    : STATIC variable_definition
    | FRIEND variable_definition
    | STATIC FRIEND variable_definition
    | variable_definition;

variable_definition
    : IDENTIFIER variable_assignment expression ';'
    | type_expression IDENTIFIER variable_assignment expression ';'
    | type_expression IDENTIFIER ';'
    | type_expression IDENTIFIER function_call_expression ';';

base_class_list
    : ':' type_expression
    | base_class_list ',' type_expression;

type_literal
    : base_class_list '{' type_scope_definitions '}'
    | '{' type_scope_definitions '}'
    | base_class_list '{' '}'
    | '{' '}';

literal_expression
    : INTEGER
    | FLOAT
    | CHARACTER
    | STRING_LITERAL
    | AUTO
    | THIS
    | TYPE type_literal;

primary_expression    
    : literal_expression
    | '(' expression ')'
    | IDENTIFIER;

expression
    : variadic_expression;

variadic_expression
    : assignment_expression
    | assignment_expression ELLIPSIS;

assignment_operator
    : '='
    | MUL_ASSIGN
    | DIV_ASSIGN
    | MOD_ASSIGN
    | ADD_ASSIGN
    | SUB_ASSIGN
    | LEFT_ASSIGN
    | RIGHT_ASSIGN
    | AND_ASSIGN
    | XOR_ASSIGN
    | OR_ASSIGN;

assignment_expression
    : logical_or_expression
    | unary_expression assignment_operator assignment_expression;

logical_or_expression
    : logical_and_expression
    | logical_or_expression OR_OP logical_and_expression;

logical_and_expression
    : inclusive_or_expression
    | logical_and_expression AND_OP inclusive_or_expression;

inclusive_or_expression
    : exclusive_or_expression
    | inclusive_or_expression '|' exclusive_or_expression;

exclusive_or_expression
    : and_expression
    | exclusive_or_expression '^' and_expression;

and_expression
    : equality_expression
    | and_expression '&' equality_expression;

equality_expression
    : relational_expression
    | equality_expression EQ_OP relational_expression
    | equality_expression NE_OP relational_expression;

comparison_operator
    : '<'
    | '>'
    | LE_OP 
    | GE_OP;

relational_expression
    : shift_expression
    | relational_expression comparison_operator shift_expression;

shift_operator
    : LEFT_OP
    | RIGHT_OP;

shift_expression
    : additive_expression
    | shift_expression shift_operator additive_expression;

additive_operator
    : '+'
    | '-';

additive_expression
    : multiplicative_expression
    | additive_expression additive_operator multiplicative_expression;

multiplicative_operator
    : '*'
    | '/'
    | '%';

multiplicative_expression
    : unary_expression
    | multiplicative_expression multiplicative_operator unary_expression;

lambda_expression
    : '[' capture_list ']' function_argument_list compound_statement
    | '[' capture_list ']' compound_statement;
    | '[' ']' function_argument_list compound_statement
    | '[' ']' compound_statement;


default_capture
    : '&' | '=' ;

capture_list
    : default_capture comma_capture_list
    | comma_capture_list;

comma_capture_list
    : variable_capture
    | comma_capture_list ',' variable_capture;

variable_capture
    : '&' IDENTIFIER
    | '=' IDENTIFIER
    | AND_OP IDENTIFIER;

unary_operator
    : '&'
    | '*'
    | '+'
    | '-'
    | '~'
    | '!'
    | INC_OP
    | DEC_OP;

unary_expression
    : unary_operator unary_expression
    | SIZEOF '(' expression ')'
    | DECLTYPE '(' expression ')'
    | lambda_expression
    | postfix_expression;

postfix_expression
    : primary_expression  { $$ = $1; }
    | postfix_expression '[' expression ']'
    | postfix_expression function_call_expression
    | postfix_expression '.' IDENTIFIER
    | postfix_expression PTR_OP IDENTIFIER
    | postfix_expression INC_OP
    | postfix_expression DEC_OP
    | postfix_expression FRIEND;

expressions
    : expression
    | expressions ',' expression;

function_argument
    : expression
    | IDENTIFIER variable_assignment '{' expressions '}'
    | IDENTIFIER variable_assignment expression;

function_arguments
    : function_argument
    | function_arguments ',' function_argument;

function_call_expression
    : '(' function_arguments ')'
    | '(' ')';

initializer_statement
    : expression
    | IDENTIFIER variable_assignment expression
    | type_expression IDENTIFIER variable_assignment expression;

destructor_statement
    : expression '~' TYPE '(' ')' ';';

return_statement
    : RETURN expression ';'
    | RETURN ';';

try_statement
    : TRY compound_statement catch_statements;

catch_statement
    : CATCH '(' type_expression IDENTIFIER ')' compound_statement;

catch_statements
    : catch_statement
    | catch_statements catch_statement
    | CATCH '(' ELLIPSIS ')' compound_statement
    | catch_statements CATCH '(' ELLIPSIS ')' compound_statement;

for_statement_initializer
    : initializer_statement ';'
    | ';';

for_statement_condition
    : expression ';'
    | ';';

for_statement_repeat
    : expression
    | ;

for_statement
    : FOR '(' for_statement_initializer for_statement_condition for_statement_repeat ')' statement;

while_statement
    : WHILE '(' initializer_statement ')' statement;

do_while_statement
    : DO statement WHILE '(' expression ')';

switch_statement
    : SWITCH '(' initializer_statement ')' '{' case_statements '}';

default_statement
    : DEFAULT ':' statements;

case_statement
    : CASE expression DOUBLE_COLON statements;

case_statements
    : case_statement 
    | case_statements case_statement { $1.push_back($2); $$ = std::move($1); }
    | case_statements default_statement { $1.push_back($2); $$ = std::move($1); };

if_statement
    : IF '(' initializer_statement ')' statement
    | IF '(' initializer_statement ')' statement ELSE statement;

continue_statement
    : CONTINUE ';';

break_statement
    : BREAK ';';

label_statement
    : IDENTIFIER ':';

goto_statement
    : GOTO IDENTIFIER ';';

throw_statement
    : THROW ';'
    | THROW expression ';';

runtime_statement
    : RUNTIME compound_statement;

compiletime_statement
    : COMPILETIME compound_statement;

statement
    : compound_statement
    | return_statement
    | try_statement
    | expression ';'
    | static_variable_definition
    | for_statement
    | while_statement
    | do_while_statement
    | switch_statement
    | if_statement
    | continue_statement
    | break_statement
    | goto_statement
    | label_statement
    | using_definition 
    | throw_statement
    | compiletime_statement
    | runtime_statement
    | destructor_statement ;

statements
    : statement
    | statements statement;

compound_statement
    : '{' '}'
    | '{' statements '}';

%%

This is my grammar. Bison takes issue with supposed ambiguity between function_argument_definition and primary_expression, and between function_argument_definition and function_argument. However, I'm pretty sure that it should already know which to pick by the time it encounters any such thing. How can I resolve these ambiguities?

1

1 Answers

3
votes

Consider the rules

function_definition:
    type_expression IDENTIFIER function_definition_arguments compound_statement

variable_definition:
    type_expression IDENTIFIER function_call_expression ';'

either of these can appear in the same context in a variety of ways, so the compiler has no way to tell which it is looking at until it gets to the ; in the variable_definition or the { in the compound_statement in the function_definition. As a result it has no way of telling whether its processing a function_definition_arguments or a function_call_expression, leading to the reduce/reduce conflicts you see.

To find this sort of problem yourself, you need to run bison with the -v option to produce a .output file showing the state machine it built. You then look at the states with conflict and backtrack to see how it gets to those states. In your example, state 280 has (two of) the reduce/reduce conflicts. One way of getting there is state 177, which is parsing function_definition_arguments and function_call_expression in parallel -- the parser is in a state where either is legal. State 177 comes from state 77, which comes from state 26, which shows the two rules I reproduced above.