I have to write a parser for a mini language and I have some problems. Here is the YACC file:

%{
#include <stdio.h>
int yylex();
void yyerror(char *s);

%}

%union {int num; char id; double d; char *s;}
%start program
%token <num> DIGIT
%token <s> IDENTIFIER
%token <num> NO
%type <num> term condition
%type <s> expression assignstmt stmt
%%
program : "##LAZY###" "vars" decllist cmpdstmt   {;}
        ;
decllist : declaration                           {;}
        | declaration decllist                  {;}
        ;
declaration : "in" IDENTIFIER                    {int $2;}
            | "in" '[' NO ']' IDENTIFIER         {int $5[$3];}
            ;                   
cmpdstmt : "exec" stmtlist "stop"                {;}
stmtlist : stmt                                  {;}
        | stmt stmtlist                         {;}
        ;
stmt : assignstmt                                {;}
    | ifstmt                                    {;}
    | whilestmt                                 {;} 
    ;
assignstmt : IDENTIFIER '=' expression           {$1 = $3;}
        ;
expression : expression '+' term                 {$$ = $1 + $3;}
        | term '+' term                       {$$ = $1 + $3;}
        ;
term : DIGIT                                     {$$ = $1;}
    | IDENTIFIER                                {$$ = $1;}
    ;
ifstmt : "if" '(' condition ')' '{' stmt '}'     {if($3){$6;}}
    ;
whilestmt : "wh" '(' condition ')' '{' stmt '}'  {while($3){$6;}}
        ;
condition : expression "<" expression            {$$ = ($1 < $3);}
        | expression "<=" expression           {$$ = ($1 <= $3);}
        | expression "==" expression           {$$ = ($1 == $3);}
        | expression "!=" expression           {$$ = ($1 != $3);}
        | expression ">=" expression           {$$ = ($1 >= $3);}
        | expression ">" expression            {$$ = ($1 > $3);}
        ;
%%

int main() {
    printf("WORKING\n");
    return yyparse();
}

void yyerror(char*s) { printf("%s\n", s); }

But when I try to compile it with: cc lex.yy.c y.tab.c I receive the following errors and I don't know how to fix them or why I receive them:

lazy.y: In function ‘yyparse’:
lazy.y:21:19: error: expected ‘)’ before ‘.’ token
declaration : "in" IDENTIFIER                    {int $2;}
                ^
lazy.y:22:19: error: expected ‘)’ before ‘.’ token
            | "in" '[' NO ']' IDENTIFIER         {int $5[$3];}

I can also post the Lex file if it is needed.

2 Answers

0
votes

In these rules:

declaration : "in" IDENTIFIER                    {int $2;}
       | "in" '[' NO ']' IDENTIFIER         {int $5[$3];}
       ;                   

the errors come from the actions {int $2;} and {int $5[$3];}.

What did you expect them to do?

By contrast, this is legal:

declaration : "in" IDENTIFIER                    {char * s = $2;}
            | "in" '[' NO ']' IDENTIFIER         {int i =  $5[$3];}
            ;

except, of course, that these variables are local to the action, so on their own they serve no real purpose.

0
votes

You are already aware that the YACC family of parser generators works by generating C code, which you then compile. What may not be clear is that, when it comes to semantic actions, these generators serve basically as template engines: they substitute the $-references and copy the rest of the action into the output as written. They are perfectly willing to produce garbage code if that's what the action template you present corresponds to, and you likely won't find out that they have done so until you try to compile the resulting code.

Additionally, your compiler and parser generator are cooperating to show you the lines of YACC code responsible for the ultimate C syntax errors that result in your case. This is very helpful for determining where you need to apply a fix, but it doesn't explain the nature of the problem very well. This is about the best it can do, however, because the compiler only knows why the C code is wrong, not why the YACC code from which it was derived is wrong.
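To make the template substitution concrete, here is a rough sketch in plain C of what the generated code for one action looks like. The function name declaration_action is hypothetical, and the exact text a generator emits for $2 varies by tool and version; the sketch only assumes that $n becomes a member access on a value-stack slot, which is what the "before '.' token" part of the error message suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the %union from the question. */
typedef union { int num; char id; double d; char *s; } YYSTYPE;

/*
 * Hypothetical stand-in for the code generated for
 *     declaration : "in" IDENTIFIER   { char *s = $2; }
 * yyvsp points at the top of the value stack, so in a two-symbol rule
 * $2 corresponds to yyvsp[0] and $1 to yyvsp[-1].
 */
static const char *declaration_action(YYSTYPE *yyvsp)
{
    assert(yyvsp != NULL);

    /*
     * The original action {int $2;} would expand to roughly
     *     int (yyvsp[0].s);
     * which is not valid C: a declarator cannot contain a member
     * access, hence "expected ')' before '.' token".
     */
    char *s = yyvsp[0].s;   /* using $2 inside an expression is fine */
    return s;
}
```

Seen this way, the compiler is complaining about the expanded C text, not about the grammar rule itself.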

So why is the YACC code wrong? Several reasons, but first and foremost because a semantic action intended to set the semantic value of a production must do so by assigning to the special symbol $$. A C statement that starts with a type name, such as is produced by your particular actions, is instead a declaration. Even if it happened to be a valid one (which definitely will not be the case here) it would not set a semantic value. Instead, you want something more like

{ $$ = $2; }

and

{ $$ = $5[$3]; }

But you have a problem with data types. $2 in the first action and $5 in the second correspond to tokens of the same type (IDENTIFIER, declared as <s>, a char *), so $2 is a char * whereas $5[$3] is a single char. There is no way that the two actions above are both compatible with the (undeclared) type of your declaration production. As a wild guess, perhaps you were trying to clean that up by casting one or both to type int, as in $$ = (int) $2;. Although that might take care of your compilation errors, it leaves you with a result that you cannot use, because you need to know the original type, and also because converting from a pointer to an int may be inherently lossy.

There is no quick and easy fix. You need to rethink your approach, paying more careful attention to data types and how to preserve and convey type information.
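One possible way to preserve that type information, offered only as a sketch, is to have the declaration actions record each variable in a small symbol table instead of trying to declare a C variable. Everything below (var_kind, var_info, declare_var, lookup_var, MAX_VARS) is a hypothetical name invented for illustration and is not part of YACC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What kind of variable a declaration introduced. */
enum var_kind { VAR_SCALAR, VAR_ARRAY };

struct var_info {
    const char   *name;    /* not copied: the caller must keep it alive */
    enum var_kind kind;
    int           length;  /* element count; 1 for a scalar */
};

#define MAX_VARS 64
static struct var_info vars[MAX_VARS];
static int var_count = 0;

/* Returns the entry for name, or NULL if it was never declared. */
static const struct var_info *lookup_var(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < var_count; i++)
        if (strcmp(vars[i].name, name) == 0)
            return &vars[i];
    return NULL;
}

/* Records a declaration. Returns 0 on success, or -1 if the length is
 * invalid, the table is full, or the name is already declared. */
static int declare_var(const char *name, enum var_kind kind, int length)
{
    if (length < 1 || var_count == MAX_VARS || lookup_var(name) != NULL)
        return -1;
    vars[var_count].name = name;
    vars[var_count].kind = kind;
    vars[var_count].length = length;
    var_count++;
    return 0;
}
```

With helpers like these, the two actions could become something like { declare_var($2, VAR_SCALAR, 1); } and { declare_var($5, VAR_ARRAY, $3); }, assuming the lexer hands over a string that outlives the parse (for example, one it allocated itself).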

Update:

It occurs to me that perhaps you were not trying to set a semantic value at all, but rather create a parser that generates C code. If that's the case then you've committed a frame error. Semantic actions contribute to the code for the generated parser itself -- that is, code used when parsing the language. If your intention is to translate the custom language into equivalent C code, then the translated code would need to be output by the parser, not part of the parser. You might achieve that by printing the wanted statements to a file, for example, but a more common approach is to have the parser build an abstract syntax tree, which you process after parsing is complete.
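As a minimal sketch of the abstract-syntax-tree approach, covering only numbers and '+', the C below shows the kind of helpers that semantic actions might call. The names node, new_num, new_add, eval and free_tree are hypothetical. It assumes the %union gains a struct node * member and that expression and term are declared with that type, which the original grammar does not do.

```c
#include <assert.h>
#include <stdlib.h>

enum node_kind { NODE_NUM, NODE_ADD };

struct node {
    enum node_kind kind;
    int            value;  /* used by NODE_NUM */
    struct node   *lhs;    /* used by NODE_ADD */
    struct node   *rhs;    /* used by NODE_ADD */
};

/* Allocates a leaf node holding a number. */
static struct node *new_num(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->kind = NODE_NUM;
    n->value = value;
    n->lhs = n->rhs = NULL;
    return n;
}

/* Allocates an addition node that takes ownership of both operands. */
static struct node *new_add(struct node *lhs, struct node *rhs)
{
    assert(lhs != NULL && rhs != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->kind = NODE_ADD;
    n->value = 0;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

/* Walks the finished tree after parsing; this is where an interpreter
 * or a C code generator would do its work. */
static int eval(const struct node *n)
{
    assert(n != NULL);
    if (n->kind == NODE_NUM)
        return n->value;
    return eval(n->lhs) + eval(n->rhs);
}

/* Releases a tree built with new_num and new_add. */
static void free_tree(struct node *n)
{
    if (n == NULL)
        return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}
```

The actions would then only build nodes, for example { $$ = new_add($1, $3); } for the '+' rule and { $$ = new_num($1); } for DIGIT, and the tree would be evaluated or translated once yyparse() returns.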