1
votes

I'm trying to learn Flex & Bison. I've read through material, and I understand how it works at a theoretical level. However, I can't seem to even implement the most basic thing without hitting a mental block. (Note: I haven't taken any compiler courses or anything like that...this is my first exposure to any of this stuff). I think once I see this super basic thing implemented, I'll be able to move on and understand much more easily.

Basically, all I'm trying to do is write a program that, upon seeing "type my_type /// some text", calls my_type's function "set_text" and sets the text to whatever follows the "///". In other words, my Bison grammar should call my_type.set_text(some text). I realize I could do this easily without Flex and Bison, but the point is to learn.

I already have the files set up correctly...all I need to implement is the token passing (from Flex) and the action taken (from Bison).

My Flex token passing so far:

"\/"{3}               { return COMMENT; }

My Bison token grabbing so far:

%token COMMENT

and that's seriously all I can come up with. I know what else I need...I just can't figure out how to do it. I know that I need:
a) to pass type and my_type as something
b) to come up with a "rule" in Bison that handles these tokens and calls the correct function

Any help? Am I way off already?

UPDATE (further thoughts on how to do this): Maybe my Bison file should include a rule like

commented_variable:                           {($2).set_text($4);}
    IDENTIFIER NAME COMMENT COMMENT_TEXT                      

Thus my Flex file would need to pass it those tokens? Am I on the right track?

2
Also, I apologize if this seems like the wrong forum to post this in. I couldn't find any sites more suitable. Let me know if you think another Stack Exchange site would have been a better choice! - Casey Patton

2 Answers

1
votes

Let me suggest a few things. Although it is possible to process COMMENT and COMMENT_TEXT separately using flex start conditions, I think it's easier to handle them as a single token.
The Bison source would look something like this (illustrative code, not tested):

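%union {
  name_type *name;        /* set by the lexer when lookup_name() succeeds */
  char const *comment;    /* text following "///", allocated with strdup() */
}
%token IDENTIFIER
%token <name> NAME
%token <comment> COMMENT
%%
commented_variable: IDENTIFIER NAME COMMENT { $2->set_text($3); }
                  ;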

Lexically, your IDENTIFIER and NAME look indistinguishable, so I tell them apart in user code (with a lookup) rather than with separate patterns. The flex source would look something like this:

"///".*                 {
                        yylval.comment = strdup( yytext + 3 );
                        return COMMENT;
                        }
[A-Za-z_][A-Za-z_0-9]*  {
                        name_type *n = lookup_name( yytext );
                        if ( n ) {
                          yylval.name = n;
                          return NAME;
                        }
                        return IDENTIFIER;
                        }

However, the code above still needs suitable definitions of name_type and lookup_name, and the pointer returned by strdup must be freed at some point.
If you don't have much experience with flex/bison, I'd recommend first verifying the lexer on its own. For example, check that the expected tokens are recognized using a simple driver such as
int main() { while ( yylex() ) {} } together with rules that just print what they match:

"///".*                 printf("comment %s\n", yytext);
[A-Za-z_][A-Za-z_0-9]*  printf("symbol %s\n", yytext);
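To get a feel for what that test lexer should report before running flex at all, the two rules can be imitated by hand. The following C++ sketch is not flex output and not how flex works internally; scan_line is a name made up for this example, and it assumes the input is a single line without a trailing newline.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hand-written imitation of the two flex rules above, for reasoning about
// the expected token stream only.
std::vector<std::string> scan_line(const std::string &line) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (line.compare(i, 3, "///") == 0) {
            // "///".* : everything up to the end of the line is one token
            out.push_back("comment " + line.substr(i));
            break;
        } else if (std::isalpha(c) || c == '_') {
            // [A-Za-z_][A-Za-z_0-9]* : longest run of identifier characters
            std::size_t start = i;
            while (i < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_'))
                ++i;
            out.push_back("symbol " + line.substr(start, i - start));
        } else {
            ++i;  // flex would echo unmatched characters by default; this sketch skips them
        }
    }
    return out;
}
```

For the input "type my_type /// some text" this yields three entries: "symbol type", "symbol my_type" and "comment /// some text". The real flex lexer should print the same three lines, with the unmatched spaces echoed in between unless a whitespace rule is added.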

Similarly, for the Bison code, I recommend first resolving grammar issues such as shift/reduce conflicts, and confirming that the input is parsed as expected with a simple action like:

commented_variable: IDENTIFIER NAME COMMENT { puts("OK"); }
0
votes

Given that you're actually using the text following the delimiter, I would not use the word "comment" for anything you have above. That said, what you've put into your update is pretty much the right idea.