
I have scanner and parser ready, using flex and bison.

The parser builds a tree directly in the actions. To do so, I created a struct called STreeNode, and I am using:

#define YYSTYPE_IS_DECLARED
typedef STreeNode* YYSTYPE;

The struct is:

typedef struct tagSTreeNode
{
    EOperationType type;
    int count;
    struct tagSTreeNode **children;
    char *string;
} STreeNode;

There are about 40 tokens, and for every rule I have something like:

unlabeled_statement:
        assignment                                                          {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        | function_call_statement                                           {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        | goto                                                              {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        | return                                                            {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        | conditional                                                       {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        | repetitive                                                        {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        | empty_statement                                                   {$$ = createNode(eUNLABELED_STATEMENT, 1, $1);}
        ;

The signature for the createNode function is:

STreeNode *createNode(EOperationType type, int count, ...);
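For readers following along, here is a minimal sketch of what a variadic function with that signature might look like. The body is not shown in the question, so this is an assumption, and the two enum values are hypothetical placeholders for the real EOperationType.

```c
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical subset of the enum; the real EOperationType is not shown in the post. */
typedef enum { eUNLABELED_STATEMENT, eASSIGNMENT } EOperationType;

typedef struct tagSTreeNode
{
    EOperationType type;
    int count;
    struct tagSTreeNode **children;
    char *string;
} STreeNode;

/* Allocates a node and copies `count` STreeNode* variadic arguments into children.
   Returns NULL if an allocation fails. */
STreeNode *createNode(EOperationType type, int count, ...)
{
    STreeNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;

    node->type = type;
    node->count = count;
    node->children = NULL;
    node->string = NULL;   /* only leaf nodes are expected to carry text */

    if (count > 0) {
        node->children = malloc((size_t)count * sizeof *node->children);
        if (node->children == NULL) {
            free(node);
            return NULL;
        }

        va_list args;
        va_start(args, count);
        for (int i = 0; i < count; i++)
            node->children[i] = va_arg(args, STreeNode *);
        va_end(args);
    }
    return node;
}
```

Every variadic argument has to be an STreeNode *, since va_arg cannot check types. That is why a char * such as a token's text cannot simply be passed through the same argument list.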

The tree works fine. The problem is accessing the actual text of variable names, function names, etc. Since YYSTYPE is a pointer to a struct, $x does not carry the string value I want to save in the char *string member of the struct.

I have a %token called IDENTIFIER and another called INTEGER, and those should receive the values I want.

While researching, I found that I could use a %union { } to give each token a specific type. Would that help? If so, would I need to specify the type of every single token? How can that be implemented?

What about yytext? Couldn't that be used to achieve this goal?

Thank you!

--- EDIT ---

So I've created

%union {
    char *string;
    STreeNode *node;
}

and declared every terminal and nonterminal to be one of those types. The nodes still work, but the strings (accessed via $1, for example) come back null.

Do I need to change anything in the scanner as well? My scanner has:

[a-zA-Z][a-z0-9A-Z]*        { return IDENTIFIER; }
[0-9]+                      { return INTEGER; }

Thanks again.

If you are using bison, why is this tagged yacc? - Scott Hunter
Just a small question, unrelated to your problem, but why do you create nodes for things that don't need them? For example, instead of creating a new node for conditional, why not simply set $$ to $1? That would simplify your tree a little and lead to far fewer nodes in it. - Some programmer dude
@ScottHunter The flex tag is for Apache Flex not the GNU lex clone, so I removed it. - Some programmer dude
As for your problem, you should probably read about the %union directive and token type names. There are many examples of how to use these if you search a little. - Some programmer dude
@Someprogrammerdude You are correct, but since this is for educational purposes, one of the things the teacher can check is how many nodes there are for a particular element, for example. BTW, thank you for editing it. - luisforque

1 Answer


If your tokens have a type declared for them, the lexer needs to store the token's value in the matching member of yylval before returning the token. Something like:

[a-zA-Z][a-z0-9A-Z]*        { yylval.string = strdup(yytext); return IDENTIFIER; }
[0-9]+                      { yylval.string = strdup(yytext); return INTEGER; }
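On the parser side, once IDENTIFIER and INTEGER are declared with the string member of the union, $1 in an action is the char * the scanner stored, and it can be attached to a leaf node. Below is a minimal sketch assuming a hypothetical helper createLeaf and hypothetical enum values eIDENTIFIER and eINTEGER; none of these appear in the question. An action could then read something like `{ $$ = createLeaf(eIDENTIFIER, $1); }`.

```c
#include <stdlib.h>

/* Hypothetical enum values; the real EOperationType is not shown in the question. */
typedef enum { eIDENTIFIER, eINTEGER } EOperationType;

typedef struct tagSTreeNode
{
    EOperationType type;
    int count;
    struct tagSTreeNode **children;
    char *string;
} STreeNode;

/* Hypothetical helper: builds a childless node that takes ownership of `text`,
   which is expected to be the heap copy made by strdup(yytext) in the scanner. */
STreeNode *createLeaf(EOperationType type, char *text)
{
    STreeNode *node = malloc(sizeof *node);
    if (node == NULL) {
        free(text);          /* avoid leaking the lexeme on failure */
        return NULL;
    }
    node->type = type;
    node->count = 0;
    node->children = NULL;
    node->string = text;     /* no second copy: the node now owns the string */
    return node;
}

/* Releases a node, its children, and any owned string. */
void freeTree(STreeNode *node)
{
    if (node == NULL)
        return;
    for (int i = 0; i < node->count; i++)
        freeTree(node->children[i]);
    free(node->children);
    free(node->string);
    free(node);
}
```

The copy made in the scanner matters because yytext points into flex's internal buffer, which is overwritten when the next token is scanned. Since bison may read a lookahead token before running an action, reading yytext from the parser can yield the wrong lexeme.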