
I'm trying to write a parser using flex and bison. However, no matter how I modify the files, I always get the error "syntax error in line 1". This is test.vm, the input file that is read through yyin:

$asfdfsdf
sdfsdfs
sdfsdfsd
sdfsdfsd
sfsdfd

This is the vtl4.l file:

%{
#include<stdio.h>
#include<string.h>
#include "context.h"
#include "bool.h"
#include "vtl4.tab.h"
%}
%%
(.|\n)* {yylval.string = yytext;return CONTENT;}
<<EOF>> {return FINAL;}
%%

This is the vtl4.y file:

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "bool.h"
#include "parser.h"
#include "context.h"
#include "vtl4.tab.h"

extern FILE * yyin;
extern FILE * yyout;
extern int yylex();
extern int yywrap();
%}

%union {
struct simpleNode *ast;
double d;
int i;
bool b;
char* string;
struct symbol *sym;
}

%type <ast> root stmts stmt

%token <string> CONTENT

%token FINAL

%%

root:stmts FINAL {printf("root\n");$$ = process($1);traverse($$);}
;

stmts: {printf("stmts:stmt\n");$$ = 0;}
|stmts stmt {printf("stmts:stmts stmt\n");$$ = add_ybrother($1,$2);}
;

stmt:CONTENT {printf("stmt\n");$$ = text($1);}
;

%%
int main(){
FILE *src;
src = fopen("test.vm","r");
yyin = src;
yyparse();
fclose(src);
return 1;
}

int yywrap(){
return 1;
}

Makefile:

CC=cc

FLEX=vtl4.l

BISON=vtl4.y

parse:vtl4.tab.c lex.yy.c
       $(CC) -o out *.c -ll


vtl4.tab.c:$(BISON)
      bison -d $(BISON) --report=all

lex.yy.c:$(FLEX)
        flex $(FLEX)

When I run ./out, it prints the right result, but it always ends with "line:1: error: syntax error". I don't know why.

It works when I change the lex rule

<<EOF>> {return FINAL;}

to

<<EOF>> {yyterminate();}

and modify the yacc rule

root:stmts FINAL {printf("root\n");$$ = process($1);traverse($$);}

to

root:stmts {printf("root\n");$$ = process($1);traverse($$);}

but I don't understand why.

1 Answer

Because the <<EOF>> rule returns FINAL, the tokenizer keeps returning FINAL on every call once it has reached end-of-file. When flex is used together with bison you don't need (and should not use) an explicit end-of-file token. Just rely on the 0 that yylex returns at end-of-file, provided yywrap returns 1. That is exactly what yyterminate does for you, which is why your second version works.

In this case the parser reduces root after the first FINAL and then expects the end-of-input token (0). Instead it gets another FINAL, followed by an endless stream of them, which the grammar can't handle. Of course you should not accommodate this endless stream in your grammar, because the grammar would then be 'correct' but the parse would never terminate.
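The failure described above can be reproduced without flex or bison at all. The following is only a rough simulation under stated assumptions: the two lexer functions and the hand-written parse function are hypothetical stand-ins (not generated code), and the token numbers are illustrative. It shows that a parser which requires end-of-input (0) after root rejects a lexer that answers every call at end-of-file with FINAL.

```c
/* Illustrative token codes; 0 is what yylex returns at end of input. */
enum { END_OF_INPUT = 0, CONTENT = 258, FINAL = 259 };

typedef int (*lexer_fn)(int *calls);

/* Hypothetical stand-in for `<<EOF>> { return FINAL; }`:
   one CONTENT token, then FINAL on every later call. */
static int lex_eof_as_token(int *calls) {
    return (*calls)++ == 0 ? CONTENT : FINAL;
}

/* Hypothetical stand-in for `<<EOF>> { yyterminate(); }`:
   one CONTENT token, then 0 on every later call. */
static int lex_eof_as_zero(int *calls) {
    return (*calls)++ == 0 ? CONTENT : END_OF_INPUT;
}

/* Hand-written recognizer for the grammar in the question.
   expect_final = 1 models `root: stmts FINAL`,
   expect_final = 0 models `root: stmts`.
   Like a bison parser, it demands END_OF_INPUT after root.
   Returns 0 on success and 1 on a syntax error. */
static int parse(lexer_fn lex, int expect_final) {
    int calls = 0;
    int tok = lex(&calls);

    while (tok == CONTENT)          /* stmts: zero or more CONTENT tokens */
        tok = lex(&calls);

    if (expect_final) {
        if (tok != FINAL)
            return 1;
        tok = lex(&calls);          /* look for end of input after FINAL */
    }
    return tok == END_OF_INPUT ? 0 : 1;
}
```

Calling parse(lex_eof_as_token, 1) returns 1, because the token after the first FINAL is another FINAL rather than 0, while parse(lex_eof_as_zero, 0) returns 0. This mirrors the two versions in the question.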

I assume you are aware your tokenizer will match a complete file in a single CONTENT token, so even though your grammar supports a list of CONTENT tokens it will always see only one.
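If you do want several CONTENT tokens, one possibility is to match a line at a time. This is only a sketch of the rules section of vtl4.l, and it assumes that one token per line is what you are after. The strdup call is an addition of mine: yytext is only valid until the next call to yylex, so the text has to be copied unless your text() function already does that.

```lex
%%
[^\n]+\n?   |
\n          { yylval.string = strdup(yytext); /* copy: yytext is reused */
              return CONTENT; }
%%
```

There is no <<EOF>> rule here, so at end-of-file yylex falls back to its default behaviour and returns 0.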

P.S: I found the problem by using the -t option to bison, which adds debug tracing to the parser; the trace showed that it choked on the second occurrence of FINAL.
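Note that -t only compiles the tracing code in; the trace is printed once yydebug is set to a nonzero value. A sketch of how main in vtl4.y could switch it on (the fopen check and the return value of 0 are my additions, not part of the original):

```c
int main(){
    FILE *src = fopen("test.vm","r");
    if (!src) { perror("test.vm"); return 1; }
#if YYDEBUG
    yydebug = 1;   /* only exists when the parser was generated with bison -t */
#endif
    yyin = src;
    yyparse();
    fclose(src);
    return 0;
}
```

The trace goes to stderr and lists every token read and every shift and reduce, so the second FINAL is easy to spot.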

P.S2: In the Makefile you used *.c in the compiler invocation for parse. This is quite dangerous as some random .c files may hang out in your directory. Better use $^ to refer to all files the rule depends on.

P.S3: As you have defined your own yywrap and main you can lose the -ll.
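Putting the two Makefile remarks together, the rules could look roughly like this. It is a sketch for GNU make, and it assumes the two generated files are everything that has to be compiled; the .c files that define process, traverse, text and add_ybrother are not shown in the question, so they would have to be added to the prerequisites of parse.

```make
CC=cc

parse: vtl4.tab.c lex.yy.c
	$(CC) -o out $^

vtl4.tab.c: vtl4.y
	bison -d vtl4.y --report=all

lex.yy.c: vtl4.l
	flex vtl4.l
```

The recipe lines must start with a tab character, not spaces.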