I'm writing a BNF grammar for a very simple programming language and using Flex and Bison to build its compiler.
I only have 3 variable and constant types: real, integer, string.
My .l file has a token definition for "ID" as follows:

DIGIT [0-9]
LETTER [a-zA-Z]
ID {LETTER}({LETTER}|{DIGIT})*


My .y file has a definition for an identifier like this:

identifier:
ID;

Now, I want to use the identifier definition to build variable and constant names. But I also want to limit assignment to data of the same type (e.g., I don't want a string assigned to an integer variable). So I created a few rules to separate each kind of variable and constant:

id_variable_string:
identifier;

id_variable_integer:
identifier;

id_variable_real:
identifier;

I did the same for constants. Now, in my language I have a section for constant declaration/definition and then a section for variable declaration. That is, constants are declared at the same time as they're assigned (something like "myConstant = 123") but variables have to be declared first, and then assigned a value in the section specifically designed for statements.
E.g., if I want an integer and a string variable, I'd have to declare them first:
STRING myStrVariable;
INTEGER myIntVariable;
And then, in the zone reserved for statements, I can choose to do an assignment (a statement can be an assignment, a decision, a selection, an output, etc.):

assignment: 
        id_variable_string ASSIGN_OPERATOR literal_string
        | id_variable_string ASSIGN_OPERATOR id_const_string 
        | id_variable_string ASSIGN_OPERATOR id_variable_string 
        | id_variable_string ASSIGN_OPERATOR concatenation
        | id_variable_integer ASSIGN_OPERATOR id_const_integer 
        | id_variable_integer ASSIGN_OPERATOR id_variable_integer  
        | id_variable_integer ASSIGN_OPERATOR expression 
        | id_variable_integer ASSIGN_OPERATOR literal_integer
        | id_variable_real ASSIGN_OPERATOR id_variable_real
        | id_variable_real ASSIGN_OPERATOR id_const_real
        | id_variable_real ASSIGN_OPERATOR expression
        | id_variable_real ASSIGN_OPERATOR literal_real
        ;

What I intend here is to say explicitly that a string variable can only be assigned a string literal, a concatenation of strings (using +), a string constant, or another string variable. The same goes for integer and real variables, except that instead of a concatenation they can be assigned an expression (math operations).
Concatenation is defined as follows:

concatenation:
        id_variable_string ADD_OPERATOR id_variable_string 
        | id_variable_string ADD_OPERATOR literal_string
        | literal_string ADD_OPERATOR id_variable_string
        | literal_string ADD_OPERATOR literal_string
        | id_const_string ADD_OPERATOR id_const_string
        | id_const_string ADD_OPERATOR id_variable_string
        | id_const_string ADD_OPERATOR literal_string
        | literal_string ADD_OPERATOR id_const_string
        | id_variable_string ADD_OPERATOR id_const_string
        ;

And expression is defined as:

expression: 
        expression ADD_OPERATOR term
        | expression SUBST_OPERATOR term
        | term
        ;

term:
        term MULTIP_OPERATOR factor
        | term DIVISION_OPERATOR factor
        | factor
        ;

factor:     
        id_variable_integer
        | id_variable_real
        | id_const_integer
        | id_const_real
        | literal_integer
        | literal_real
        | PARENTHESIS_OPEN expression PARENTHESIS_CLOSE
        ;

Now, this is what Bison is saying:


55 assignment: id_variable_integer ASSIGN_OPERATOR id_const_integer
56 | id_variable_integer ASSIGN_OPERATOR id_variable_integer
58 | id_variable_integer ASSIGN_OPERATOR literal_integer
59 | id_variable_real ASSIGN_OPERATOR id_variable_real
60 | id_variable_real ASSIGN_OPERATOR id_const_real
62 | id_variable_real ASSIGN_OPERATOR literal_real


State 50 conflicts: 1 shift/reduce
State 76 conflicts: 14 shift/reduce
State 130 conflicts: 2 shift/reduce
State 131 conflicts: 1 shift/reduce
State 133 conflicts: 1 shift/reduce
State 134 conflicts: 1 shift/reduce
State 135 conflicts: 1 shift/reduce
State 137 conflicts: 1 shift/reduce
State 138 conflicts: 1 shift/reduce


I'm assuming something in my grammar is wrong but I'm not sure what exactly.

1 Answer

You said:

So I created a few rules to separate each kind of variable and constant:

id_variable_string:
identifier;

id_variable_integer:
identifier;

id_variable_real:
identifier;

And this is your problem. Nothing in the syntax distinguishes an id_variable_string from an id_variable_integer, so you have (at least two) wasted rules. This is what Bison is complaining about: when it gets an identifier, it has no clue whether it should treat it as an id_variable_string or an id_variable_integer.

You have to handle the type conflicts differently: with a semantic check (not a syntactic check) that the type associated with the identifier is consistent with the types of the other identifiers in the expression.
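
As a rough illustration of such a semantic check, here is a minimal C sketch. It assumes a small symbol table that the declaration rules fill in and that the action for an assignment consults. The names Type, declare, lookup and check_assignment are hypothetical (they are not from the question), and requiring the two types to match exactly is only one possible policy; you might prefer to let an integer be assigned to a real, for instance.

```c
#include <stdio.h>
#include <string.h>

/* The three types from the question, plus a marker for undeclared names. */
typedef enum { TYPE_INTEGER, TYPE_REAL, TYPE_STRING, TYPE_UNKNOWN } Type;

/* Hypothetical fixed-size symbol table (an assumption, not the asker's code). */
#define MAX_SYMBOLS 64
typedef struct {
    char name[32];
    Type type;
} Symbol;

static Symbol symbols[MAX_SYMBOLS];
static int symbol_count = 0;

/* Return the declared type of `name`, or TYPE_UNKNOWN if it was never declared. */
Type lookup(const char *name) {
    for (int i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, name) == 0) {
            return symbols[i].type;
        }
    }
    return TYPE_UNKNOWN;
}

/* Record a declaration such as "INTEGER myIntVariable;".
   Returns 0 on success, -1 if the table is full, the name is too long,
   or the name was already declared. */
int declare(const char *name, Type type) {
    if (symbol_count == MAX_SYMBOLS) return -1;
    if (strlen(name) >= sizeof symbols[0].name) return -1;
    if (lookup(name) != TYPE_UNKNOWN) return -1;
    strcpy(symbols[symbol_count].name, name);
    symbols[symbol_count].type = type;
    symbol_count++;
    return 0;
}

/* Semantic check for "target = <value of type source>".
   Returns 1 if the assignment is allowed, 0 otherwise.
   Assumed policy: the two types must match exactly. */
int check_assignment(const char *target_name, Type source) {
    Type target = lookup(target_name);
    if (target == TYPE_UNKNOWN) {
        fprintf(stderr, "undeclared variable '%s'\n", target_name);
        return 0;
    }
    if (target != source) {
        fprintf(stderr, "type mismatch in assignment to '%s'\n", target_name);
        return 0;
    }
    return 1;
}
```

With something like this in place, the grammar would need only one kind of identifier on the left of ASSIGN_OPERATOR. The action attached to that rule could call check_assignment with the identifier's name and whatever type was computed for the right-hand side, and report an error when it returns 0.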