
First of all, I should say that I am very new to Flex and Bison, and I am a bit confused. We have a school project that asks us to create a compiler, using Flex and Bison, for a CLIPS-like language. My code has a lot of problems, but the main one is that whatever I type, I get a syntax error when the result should be something else. Ideally it would work fully for the CLIPS language. For example, when I write "4" I get a syntax error (reading my code may help you understand this better). If I write "test 3 4", it doesn't show a syntax error, but it reports an unknown token, which is wrong again. I'm completely lost. The code is a prototype provided by the school, and we need to make some changes to it. If you have any questions, don't hesitate to ask. Thank you!

P.S.: Don't mind the comments; they are in Greek.

FLEX CODE:

%option noyywrap


/* Kwdikas C gia orismo twn apaitoumenwn header files kai twn metablhtwn.
   Otidhpote anamesa sta %{ kai %} metaferetai autousio sto arxeio C pou
   tha dhmiourghsei to Flex. */

%{

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Header file pou periexei lista me ola ta tokens */
#include "token.h"

/* Orismos metrhth trexousas grammhs */
int line = 1;

%}


/* Onomata kai antistoixoi orismoi (ypo morfh kanonikhs ekfrashs).
   Meta apo auto, mporei na ginei xrhsh twn onomatwn (aristera) anti twn,
   synhthws idiaiterws makroskelwn kai dysnohtwn, kanonikwn ekfrasewn */
/*  dimiourgia KE simfona me ta orismata tis glossas */

DELIMITER       [ \t]+
INTCONST        [+-]*[1-9][0-9]*
VARIABLE    [?][A-Za-z0-9]*         
DEFINITIONS [a-zA-Z][-|_|A-Z|a-z|0-9]*
COMMENTS    ^;.*$


/* Gia kathe pattern (aristera) pou tairiazei ekteleitai o antistoixos
   kwdikas mesa sta agkistra. H entolh return epitrepei thn epistrofh
   mias arithmhtikhs timhs mesw ths synarthshs yylex() */
/*  an sinantisei diaxoristi i sxolio to agnoei, an sinantisei akeraio,metavliti i orismo ton emfanizei. se kathe alli periptosi ektiponei oti den anagnorizei to token, ti grammi pou vrisketai kai to string pou dothike */

%%

{DELIMITER}     {;}
"bind"      { return BIND;}
"test"      { return TEST;}
"read"      { return READ;}
"printout"  { return PRINTOUT;}
"deffacts"  { return DEFFACTS;}
"defrule"   { return DEFRULE;}
"->"        { return '->';}
"="     { return '=';}
"+"     { return '+';}
"-"     { return '-';}
"*"     { return '*';}
"/"     { return '/';}
"("     { return '(';}
")"     { return ')';}      
{INTCONST}      { return INTCONST; }
{VARIABLE}  { return VARIABLE; }
{DEFINITIONS}   { return DEFINITIONS; }
{COMMENTS}  {;}
\n              { line++; printf("\n"); }
.+      { printf("\tLine=%d, UNKNOWN TOKEN, value=\"%s\"\n",line, yytext);}
<<EOF>>     { printf("#END-OF-FILE#\n"); exit(0); }

%%

/* Pinakas me ola ta tokens se antistoixia me tous orismous sto token.h */

char *tname[11] = {"DELIMITER","INTCONST" , "VARIABLE", "DEFINITIONS", "COMMENTS", "BIND", "TEST", "READ", "PRINTOUT", "DEFFACTS", "DEFRULE"};

BISON CODE:

%{
/* Orismoi kai dhlwseis glwssas C. Otidhpote exei na kanei me orismo h arxikopoihsh
   metablhtwn & synarthsewn, arxeia header kai dhlwseis #define mpainei se auto to shmeio */
        #include <stdio.h>
    #include <stdlib.h>
        int yylex(void);
        void yyerror(char *);
%}

/* Orismos twn anagnwrisimwn lektikwn monadwn. */
%token INTCONST VARIABLE DEFINITIONS PLUS NEWLINE MINUS MULT DIV COM BIND TEST READ PRINTOUT DEFFACTS DEFRULE

%%

/* Orismos twn grammatikwn kanonwn. Kathe fora pou antistoixizetai enas grammatikos
   kanonas me ta dedomena eisodou, ekteleitai o kwdikas C pou brisketai anamesa sta
   agkistra. H anamenomenh syntaksh einai:
                onoma : kanonas { kwdikas C } */
program:
        program expr NEWLINE { printf("%d\n", $2); }
        |
        ;
expr:
        INTCONST         { $$ = $1; }
    | VARIABLE  { $$ = $1; }//prosthiki tis metavlitis
        | PLUS expr expr { $$ = $2 + $3; }//prosthiki tis prosthesis os praksi
    | MINUS expr expr { $$ = $2 - $3; } //prosthiki tis afairesis os praksi
    | MULT expr expr { $$ = $2 * $3; }//prosthiki tou pollaplasiasmou os praksi
    | DIV expr expr { $$ = $2 / $3; }//prosthiki tis diairesis os praksi
    | COM       { $$ = $1; }//prosthiki ton sxolion
    | DEFFACTS expr { $$ = $2; }//prosthiki ton gegonoton
    | DEFRULE expr  { $$ = $2; }//prosthiki ton kanonon
    | BIND expr expr    { $$ = $2;}//prosthiki tis bind
    | TEST expr expr    { $$ = $2 ;}//prosthiki tis test
    | READ expr expr    { $$ = $2 ;}//prosthiki tis read
    | PRINTOUT expr expr    { $$ = $2 ;}//prosthiki tis printout
        ;

%%



/* H synarthsh yyerror xrhsimopoieitai gia thn anafora sfalmatwn. Sygkekrimena kaleitai
   apo thn yyparse otan yparksei kapoio syntaktiko lathos. Sthn parakatw periptwsh h
   synarthsh epi ths ousias typwnei mhnyma lathous sthn othonh. */
void yyerror(char *s) {
        fprintf(stderr, "Error: %s\n", s);
}


/* H synarthsh main pou apotelei kai to shmeio ekkinhshs tou programmatos.
   Sthn sygkekrimenh periptwsh apla kalei thn synarthsh yyparse tou Bison
   gia na ksekinhsei h syntaktikh analysh. */
int main(void)  {
        yyparse();
        return 0;
}

TOKEN FILE:

#define DELIMITER 1
#define INTCONST 2
#define VARIABLE 3
#define DEFINITIONS 4
#define COMMENTS 5
#define BIND 6
#define TEST 7
#define READ 8
#define PRINTOUT 9
#define DEFFACTS 10
#define DEFRULE 11

MAKEFILE:

all:
    bison -d simple-bison-code.y
    flex mini-clips-la.l
    gcc  simple-bison-code.tab.c lex.yy.c -o B2
    ./B2
clean:
    rm simple-bison-code.tab.c simple-bison-code.tab.h lex.yy.c B2
Interesting language! How did you intend to give the number 0? :D - Antti Haapala
Also, I've never seen Greek written with the Latin alphabet like that. It looks awful :D - Antti Haapala
@antti: I think it's pretty common. With a bit of practice, it's easy to decode. - rici
@AnttiHaapala To give the number 0 where? :P Well, actually it's Greek combined with English, the so-called "Greeklish", and yeah, I agree it's awful! - Miltos Taramanidis
He means that your pattern for integers won't recognise the integer 0. That's probably not what you wanted. - rici

1 Answer

  1. Your top-level rule is:

    program:
        program expr NEWLINE 
    

    which cannot succeed unless the parser sees a NEWLINE token. But it will never see one, because your lexical scanner never sends one; when it sees a newline, it increments the line count but doesn't return anything.

  2. All your tokens are considered invalid because your lexical scanner uses its own definitions of the token values. You shouldn't do that. The parser generator (bison/yacc) will generate a header file containing the correct definitions; that is, the values it is expecting to see.

  3. There are various other problems, probably more than I noticed. The most important is that you should not call exit(0) in the <<EOF>> rule, since that will mean that the parser can never succeed; it does not succeed until it is passed an EOF token. In fact, you should not normally have an <<EOF>> rule; the default action is to return 0 and that is pretty well the only action which makes sense.

  4. Also, '->' is not a correct C literal. The compiler would have complained about it if you had enabled compiler warnings (-Wall), which you should always do, even if you are compiling generated code.

  5. And your scanner's last pattern, intended to trigger on bad tokens, is .+, which will match the entire line, not just the erroneous character. Since (f)lex scanners accept the pattern with the longest match, most of your other patterns will never match. (Flex usually warns you about unmatchable patterns. Didn't you get such a warning?)

    The fallback pattern should be .|\n, although you can use . if you are absolutely sure that every newline will be matched by some rule. I like to use %option nodefault, which will cause flex to warn me if there is some possible input not matched by any rule.
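
Putting the points above together, the scanner might look roughly like the sketch below. It is an outline under a few assumptions, not a tested solution. The header name `simple-bison-code.tab.h` is inferred from the `bison -d` line in the question's Makefile. `ARROW` is a hypothetical token name that would also have to be added to the grammar's `%token` list. The operators return the `PLUS`, `MINUS`, `MULT` and `DIV` tokens that the grammar already declares. The pattern definitions are copied from the question unchanged. Setting `yylval` goes beyond the points listed above and assumes bison's default `int` semantic type.

```lex
%option noyywrap nodefault

%{
#include <stdio.h>
#include <stdlib.h>

/* Token codes and yylval come from the header written by `bison -d`
   (file name assumed from the Makefile), replacing the hand-written token.h. */
#include "simple-bison-code.tab.h"

int line = 1;
%}

DELIMITER   [ \t]+
INTCONST    [+-]*[1-9][0-9]*
VARIABLE    [?][A-Za-z0-9]*
DEFINITIONS [a-zA-Z][-|_|A-Z|a-z|0-9]*
COMMENTS    ^;.*$

%%

{DELIMITER}   { /* skip blanks and tabs */ }
{COMMENTS}    { /* skip comment lines */ }
"bind"        { return BIND; }
"test"        { return TEST; }
"read"        { return READ; }
"printout"    { return PRINTOUT; }
"deffacts"    { return DEFFACTS; }
"defrule"     { return DEFRULE; }
"->"          { return ARROW; /* hypothetical token: needs %token ARROW in the grammar */ }
"="           { return '='; }
"+"           { return PLUS; }
"-"           { return MINUS; }
"*"           { return MULT; }
"/"           { return DIV; }
"("           { return '('; }
")"           { return ')'; }
{INTCONST}    { yylval = atoi(yytext); return INTCONST; /* assumes int semantic values */ }
{VARIABLE}    { return VARIABLE; }
{DEFINITIONS} { return DEFINITIONS; }
\n            { line++; return NEWLINE; /* the parser needs this token to finish a line */ }
.             { printf("\tLine=%d, UNKNOWN TOKEN, value=\"%s\"\n", line, yytext); }

%%
```

There is deliberately no `<<EOF>>` rule here: at end of input `yylex()` returns 0 by default, which is what allows `yyparse()` to finish. The single-character fallback `.` together with the `\n` rule covers every possible input, so `%option nodefault` should not produce a warning.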