3
votes

I am using a char* as YYSTYPE in a compiler built with flex and bison. The line

#define YYSTYPE char*

is at the top of my grammar file. A few of the tokens in my lexer need to pass the entire string that they match to my grammar, and the others just need to pass their token, so this works well for me. I do this sort of thing in my lexer:

[(foo|bar)]    {yylval = *strdup(yytext); return FOOBAR;}

In my grammar, I use them with productions like this one:

fb:
    FOOBAR
    {
        sprintf($$, "%s", &$1);
    }
    ;

This sets the value of $$ to the first character in the original matched token. I (probably) understand why, since a dereferenced char* is a char, but the steps I took to fix it caused problems. For instance, removing the & from the sprintf() line causes a segfault. Removing the * from the assignment causes a "makes integer from pointer without a cast". What do I do? I think the problem lies in the assignment to yylval.

3 Answers

4
votes

There are several problems with what you are doing. First and foremost, since YYSTYPE is a char pointer, there is actually no space allocated for a string. So when you do sprintf($$, "%s", &$1), you try to print a string into a pointer that is not initialized ($$ is a pointer, but not initialized to anything, so it can point to anywhere in memory.)

Another problem is your use of &$1 in the sprintf. That expression is the address of the pointer variable itself, not the string the pointer points to.

A third problem is that you are using strdup in the lexer, which allocates memory, but you never free it anywhere, which creates a memory leak.

The fourth and final problem explains why you get only a single character, and you are actually lucky to get even that. strdup(yytext) returns a pointer to a copy of the string, but the star in front of it dereferences that pointer, which yields a char. You therefore set the pointer to the value of a single character rather than to the address of a string.
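
The points above can be sketched in plain C, outside of flex and bison. This is only an illustration under assumptions: copy_lexeme and format_lexeme are hypothetical names standing in for the lexer action and the grammar action, and the copy is written with malloc/memcpy (the same effect as strdup) so that the sketch needs nothing beyond standard C.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the lexer action. It returns a pointer to an
   owned copy of the matched text, which is what `yylval = strdup(yytext)`
   (without the leading `*`) would store. The caller must free() it. */
static char *copy_lexeme(const char *yytext)
{
    size_t size = strlen(yytext) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, yytext, size);
    return copy;
}

/* Hypothetical stand-in for the grammar action. It allocates the
   destination before printing into it, and passes the string itself
   rather than the address of the pointer. The caller must free() it. */
static char *format_lexeme(const char *lexeme)
{
    size_t size = strlen(lexeme) + 1;
    char *out = malloc(size);
    if (out != NULL)
        snprintf(out, size, "%s", lexeme);
    return out;
}
```

Both functions hand ownership of the returned buffer to the caller, so each result needs exactly one free() once it is no longer used.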

Edit: I hope it all makes sense, it's late and I might have had a glass of wine or two...

4
votes

Change the assignment back to yylval = strdup(yytext), and change sprintf(...) to $$ = $1. Make sure YYSTYPE is defined in your parser (.y) file, and that the header generated from it is included in your lexer (.l) file.


I had hoped to use just YYSTYPE, but I couldn't get that to work, so I used %union {} instead.
After experimenting and going back a bit, I got it to work with these changes:

In your parser.y:

%{
%}

%output "parser.c"
%defines "parser.h"

%union {
    char *str;
}

%type <str> fb
%start fb

%token <str> FOOBAR

%%
fb: FOOBAR { $$ = $1; }
%%

In your lexer.l:

%{
#include <string.h> 
#include "parser.h"
%}

%option outfile="lexer.c"
%option header-file="lexer.h"

%%
(foo|bar) {
    /* (foo|bar) matches "foo" or "bar"; [(foo|bar)] is a character class. */
    yylval.str = strdup(yytext);
    return FOOBAR;
}
%%

Note:

  1. You will need to define yyerror, yywrap, and main somewhere.
  2. As it stands, this doesn't free the string; you'll need to figure out where best to do that.
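
A minimal sketch of the supporting C code those two notes call for, under assumptions: the error message format is arbitrary, and release_value is a hypothetical helper name, not something flex or bison provides. main is left out because it would call yyparse(), which only exists once bison has generated the parser.

```c
#include <stdio.h>
#include <stdlib.h>

/* Called by the generated parser when it hits a syntax error. */
void yyerror(const char *msg)
{
    fprintf(stderr, "parse error: %s\n", msg);
}

/* Called by the generated lexer at end of input. Returning 1 tells it
   that there is no further input to switch to. */
int yywrap(void)
{
    return 1;
}

/* Hypothetical helper: frees a string that the lexer created with
   strdup() once the grammar no longer needs it, and returns NULL so the
   caller can clear its own pointer in the same statement. */
char *release_value(char *value)
{
    free(value);   /* free(NULL) is a no-op, so NULL is accepted too */
    return NULL;
}
```

One possible place to call such a helper is the action of whichever rule last uses the value, for example value = release_value(value); after the string has been printed or copied elsewhere.
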
3
votes

I solved that with the following (in both the .l and the .y file, before the #include of the .tab.h header):

#ifndef YYSTYPE
# define YYSTYPE char*
#endif
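
Why the position of the guard matters can be sketched in plain C. The second preprocessor block below is an assumption: it is a stand-in for the fallback a bison-generated header applies when nothing has defined YYSTYPE, reproduced here only to make the sketch self-contained. store_and_measure is a hypothetical helper.

```c
#include <string.h>

/* The guard from the answer. It has to appear before the generated header. */
#ifndef YYSTYPE
# define YYSTYPE char*
#endif

/* Stand-in for the generated header's fallback (an assumption, see above).
   It is skipped here because YYSTYPE is already defined; without the guard
   above, the semantic type would silently become int. */
#if !defined YYSTYPE && !defined YYSTYPE_IS_DECLARED
typedef int YYSTYPE;
#endif

/* Normally defined by the generated parser. */
YYSTYPE yylval;

/* Hypothetical helper: stores a string in yylval and reads its length back.
   This only type-checks cleanly when YYSTYPE is char*. */
size_t store_and_measure(char *text)
{
    yylval = text;
    return strlen(yylval);
}
```

If the lexer sees the header without the guard, yylval is an int there, which would be consistent with the "makes integer from pointer without a cast" warning mentioned in the question.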