0
votes

I am learning flex and bison, and I cannot figure out why such a simple grammar rule does not work as I expected. Below is the lexer code:

%{

#include <stdio.h>
#include "zparser.tab.h"

%}

%%

[\t\n ]+        //ignore white space

FROM|from           { return FROM;   }
select|SELECT       { return SELECT; }
update|UPDATE       { return UPDATE; }
insert|INSERT       { return INSERT; }
delete|DELETE       { return DELETE; }
[a-zA-Z].*          { return IDENTIFIER; }
\*                  { return STAR;   }

%%

And below is the parser code:

%{
#include<stdio.h>
#include<iostream>
#include<vector>
#include<string>
using namespace std;

extern int yyerror(const char* str);
extern int yylex();


%}

%%

%token SELECT UPDATE INSERT DELETE STAR IDENTIFIER FROM;


ZQL     : SELECT STAR FROM  IDENTIFIER { cout<<"Done"<<endl; return 0;}
        ;

%%

Can anyone tell me why it shows an error when I enter "select * from something"?

2
Careful of tagging. The Flex tag is used for the Adobe/Apache UI Framework. The Flex-lexer tag is used for the lexical analyzer. - JeffryHouser
Shows what error, at what token? - user207421

2 Answers

2
votes

[a-zA-Z].* will match an alphabetic character followed by any number of arbitrary characters except newline. In other words, it will match from an alphabetic character to the end of the line.

Since flex always accepts the longest match, the whole line select * from ... is returned as a single token, IDENTIFIER. The grammar expects SELECT as the first token, so the parser reports a syntax error.
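To make the longest-match reasoning concrete, here is a minimal C sketch that hand-computes how many characters each of the two competing rules would consume at the start of the input. It is not flex output; the function names (match_select, match_greedy_identifier, longest_match_winner) are hypothetical and exist only for this illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length matched by select|SELECT at the start of s, or 0 if neither matches. */
size_t match_select(const char *s) {
    if (strncmp(s, "select", 6) == 0 || strncmp(s, "SELECT", 6) == 0)
        return 6;
    return 0;
}

/* Length matched by [a-zA-Z].* : one letter, then every character
 * up to (but not including) the next newline. */
size_t match_greedy_identifier(const char *s) {
    if (!isalpha((unsigned char)s[0]))
        return 0;
    return strcspn(s, "\n");
}

/* Mimics the choice between the two rules: the longest match wins,
 * and on a tie the rule listed first in the file (the keyword) wins. */
const char *longest_match_winner(const char *s) {
    size_t kw = match_select(s);
    size_t id = match_greedy_identifier(s);
    if (id > kw)
        return "IDENTIFIER";
    return kw > 0 ? "SELECT" : "none";
}
```

For the input "select * from something" the keyword rule covers 6 characters while the identifier rule covers all 23, so longest_match_winner returns "IDENTIFIER". The keyword only wins when nothing follows it on the line.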

1
votes

[a-zA-Z].* { return IDENTIFIER; }

The problem is here. It allows any junk to follow an initial alpha character and be returned as IDENTIFIER, including in this case the entire rest of the line after the initial 's'.

It should be:

[a-zA-Z]+          { return IDENTIFIER; }

or possibly

[a-zA-Z][a-zA-Z0-9]*          { return IDENTIFIER; }

or whatever else you want to allow to follow an initial alpha character in your identifiers.
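As a rough check of what the corrected rule should produce, the following C sketch imitates the lexer by hand, assuming the [a-zA-Z][a-zA-Z0-9]* form of the identifier rule. It is an illustration only, not generated scanner code: next_token and is_keyword are hypothetical helpers, token names are returned as strings instead of the bison token codes, and the "EOF" and "UNKNOWN" results are assumptions added for the sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is the lexeme s[0..n) exactly the keyword,
 * written either all lowercase or all uppercase? */
static int is_keyword(const char *s, size_t n,
                      const char *lower, const char *upper) {
    return strlen(lower) == n &&
           (strncmp(s, lower, n) == 0 || strncmp(s, upper, n) == 0);
}

/* Returns the name of the next token and advances *cursor past it. */
const char *next_token(const char **cursor) {
    const char *s = *cursor;

    /* [\t\n ]+ is ignored */
    while (*s == ' ' || *s == '\t' || *s == '\n')
        s++;

    if (*s == '\0') { *cursor = s; return "EOF"; }
    if (*s == '*')  { *cursor = s + 1; return "STAR"; }

    if (isalpha((unsigned char)*s)) {
        /* [a-zA-Z][a-zA-Z0-9]* : the lexeme stops at the first
         * character that is neither a letter nor a digit. */
        size_t n = 1;
        while (isalnum((unsigned char)s[n]))
            n++;
        *cursor = s + n;

        /* A keyword wins only when it covers the whole lexeme,
         * which mirrors "longest match, then first rule". */
        if (is_keyword(s, n, "select", "SELECT")) return "SELECT";
        if (is_keyword(s, n, "from",   "FROM"))   return "FROM";
        if (is_keyword(s, n, "update", "UPDATE")) return "UPDATE";
        if (is_keyword(s, n, "insert", "INSERT")) return "INSERT";
        if (is_keyword(s, n, "delete", "DELETE")) return "DELETE";
        return "IDENTIFIER";
    }

    *cursor = s + 1;       /* assumption: report anything else as UNKNOWN */
    return "UNKNOWN";
}
```

Calling next_token repeatedly on "select * from something" yields SELECT, STAR, FROM, IDENTIFIER, which is the sequence the ZQL rule expects. A word such as "selection" still comes back as IDENTIFIER, because the identifier match is longer than the keyword.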