I am trying to write a parser that recognizes a particular syntax. I am pretty far and most of the stuff works. I came across a problem with the Bison parser though when trying to enable it to recognize two similar but distinct constructs. Here are the two constructs I want it to recognize:
type_1 IS ENUMERATION OF
BEGIN
element_1,
element_2 := 4
END type_1;
type_2 IS ENUMERATION SIZE 8 OF
BEGIN
element_1,
element_2 := 4
END type_2;
Basically this is for a enum like declaration that can be given with or without a bitsize parameter for the elements.
The Bison file I wrote contains two alternatives for the two versions. Each of them separate, when either being alone, or when it comes before the other one, works well. However when I put both description into the Y file, it only recognizes one of my declarations or the other.
Here are the lines of Bison code:
enum_type : sized_enum_type { $$=$1; P3; }
| nonsized_enum_type { $$=$1; P3; }
;
sized_enum_type : identifier[L] {P1(37,$L->Get_Line(),$L->Get_Column());}
IS[I] {P2(38,$I.ln,$I.clmn);}
ENUMERATION[N] {P2(43,$N.ln,$N.clmn);}
SIZE[Z] {P2(42,$Z.ln,$Z.clmn);}
INTEGER_VALUE[V] {P2(39,$V.ln,$V.clmn);}
OF[O] {P2(40,$O.ln,$O.clmn);}
TBEGIN[B] {P2(16,$B.ln,$B.clmn);}
enum_element_list[S] {P2(41,$S->Get_Line(),$S->Get_Column());}
TEND[E] {P2(16,$E.ln,$E.clmn);}
identifier[R] {P2(15,$R->Get_Line(),$R->Get_Column());}
SEMICOLON {$$= new AST::AST_Type_Enum( (AST::AST_Code_Identifier *) $L, $V.value ); $$->_Set_Line_And_Column( $I.ln, $I.clmn); P3;}
;
nonsized_enum_type : identifier[L] {P1(37,$L->Get_Line(),$L->Get_Column());}
IS[I] {P2(38,$I.ln,$I.clmn);}
ENUMERATION[N] {P2(39,$N.ln,$N.clmn);}
OF[O] {P2(40,$O.ln,$O.clmn);}
TBEGIN[B] {P2(16,$B.ln,$B.clmn);}
enum_element_list[S] {P2(41,$S->Get_Line(),$S->Get_Column());}
TEND[E] {P2(16,$E.ln,$E.clmn);}
identifier[R] {P2(15,$R->Get_Line(),$R->Get_Column());}
SEMICOLON {$$= new AST::AST_Type_Enum( (AST::AST_Code_Identifier *) $L, -1 ); $$->_Set_Line_And_Column( $I.ln, $I.clmn); P3;}
;
enum_element_list : one_enum { $$=$1; }
| one_enum COMMA enum_element_list { ((AST::AST_Enum_Value *)$1)->Append( (AST::AST_Enum_Value *) $3); $$=$1; }
;
one_enum : identifier[L] {$$= new AST::AST_Enum_Value( (AST::AST_Code_Identifier *) $L, -1 ); $$->_Set_Line_And_Column( $L->Get_Line(),$L->Get_Column()); P3;}
| identifier[L] {P1(17,$L->Get_Line(),$L->Get_Column());}
ASSIGNMENT_SHALLOW_COPY[A] {P2(42,$A.ln, $A.clmn);}
INTEGER_VALUE[I] {$$= new AST::AST_Enum_Value( (AST::AST_Code_Identifier *) $L, $I.value ); $$->_Set_Line_And_Column( $L->Get_Line(),$L->Get_Column()); P3;}
;
I dont think the C-statements on the right side of each rule part are necessary but I include them here nevertheless.
When I run the compiled parser on my sourcecode it complaints that it was expecting the token SIZE. When I exchange the rules for nonsized_enum_type and sized_enum_type (not the right side in the rule for enum_type, the complete source code order for the rules, move one rule below the other) then it recognizes the other form without size. When I include the SIZE 8 part in my code, then it complains that it was expecting the other form (OF expected).
So my question is: How is it possible to write a parser rule that recongizes both parts?
Shouldn't the generated code recognize that one rule is leading nowhere and then try the other one? It appears the second rule is never tried.
Thanks everyone