I recently started learning basic Flex and Bison because I have to make a parser for a simple (but not too simple) grammar. I decided to model a simplified Java language. I wrote the .l and .y files and everything compiles without errors (I'm using gcc). The problem is that every time I run the generated program I get a syntax error, even with input as simple as: private class Something{}. The only time I do not get a syntax error is when I enter an empty line (\n). I've been struggling with this for a few days now and I suspect there's a problem somewhere in my grammar, but I can't seem to find it. Of course there may be other problems too, because I'm pretty new to Flex and Bison.

Any help would be really appreciated.

Here are the .l and .y files:

java.l

%{
#include "java.tab.h"
%}

%option noyywrap

%%

"\n" return 0;
[ \t] ;

"private" {return PRIVATE;}
"public" {return PUBLIC;}
"protected" {return PROTECTED;}
"implenets" {return IMPLEMENTS;}
"extends" {return EXTENDS;}
"class" {return CLASS;}
"interface" {return INTERFACE;}
"if" {return IF;}
"while" {return WHILE;}
"return" {return RETURN;}
"true" {return BOOLEAN;}
"false" {return BOOLEAN;}

[A-z][a-z0-9]* {return NAME;}

"\""[A-z0-9]*"\"" {return STRING;}
"-"?[1-9][0-9]* {return INT;}

"+"|"-"|"*"|"/"|"="|"==" {return OPERATOR;}

%%

java.y

%{
#include <stdio.h>

int cond=0;
int loops=0;
int assigns=0;
int funcs=0;
int classes=0;

void yyerror(const char* msg){printf("Error: %s\n", msg);}
%}


%token PUBLIC
%token PRIVATE
%token PROTECTED
%token NAME
%token IMPLEMENTS
%token EXTENDS
%token CLASS
%token INTERFACE
%token IF
%token WHILE
%token STRING
%token BOOLEAN
%token OPERATOR
%token RETURN 
%token INT

%%

Code: Class Code | /*empty*/ {printf("classes: %d\n", classes); printf("functions: %d\n", funcs); printf("conditions: %d\n", cond); 
                                printf("loops: %d\n", loops); printf("assign operations: %d\n", assigns);} ;
Class: Modifier ClassType NAME Extra '{' Functions '}' ;
Modifier: PUBLIC | PRIVATE | PROTECTED ;
ClassType: CLASS | INTERFACE ;
Extra: IMPLEMENTS NAME | EXTENDS NAME | /*empty*/ ;
Functions: Function Functions | /*empty*/ ;
Function: Type NAME '(' Arguments ')' '{' Commands '}' {funcs++;} ;
Arguments: Argument Arguments | /*empty*/ ;
Argument: Type NAME Separator ;
Type: STRING | INT | BOOLEAN ;
Separator: ',' | /*empty*/ ;
Commands: Command Commands | /*empty*/ ;
Command: Condition | Loop | Assignment | Return ;
Condition: IF '(' Comparison ')' '{' Commands '}' {cond++;} ;
Loop: WHILE '(' Comparison ')' '{' Commands '}' {loops++;} ;
Comparison: NAME OPERATOR INT | NAME OPERATOR NAME | INT OPERATOR NAME ;
Assignment: NAME '=' Type ';' {assigns++;} ;
Return: RETURN RetVal ';' ;
RetVal: NAME | Type ;

%%

int main()
{
   yyparse();
   return 0;
}

2 Answers

4
votes

Here's a start:

First, flex's default rule echoes any character that no other rule matches. Your lexer has no rule for { and }, so they are echoed to the output and never reach bison, which makes it impossible for the Class production to match. A simple solution is to add a catch-all rule as the last flex rule:

. { return yytext[0]; }

Second, [A-z] is not the same as [A-Za-z] because Z and a are not consecutive in ASCII. I recommend using [[:alpha:]] for alphabetic characters and [[:alnum:]] for alphanumerics, but there's nothing wrong with [A-Za-z] and [A-Za-z0-9]. In both cases, you might want to allow other characters, such as _. (That's not causing you any immediate problems, it's just a note.)

Third, you spelled "implements" incorrectly.
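Putting the three points together, java.l might look like the sketch below. Everything not mentioned above is copied unchanged from the question; allowing _ in names is optional and is my assumption about what you want.

```lex
%{
#include "java.tab.h"
%}

%option noyywrap

%%

"\n"            { return 0; }   /* as in the question: a newline ends the input */
[ \t]           ;               /* skip blanks */

"private"       { return PRIVATE; }
"public"        { return PUBLIC; }
"protected"     { return PROTECTED; }
"implements"    { return IMPLEMENTS; }   /* spelling fixed */
"extends"       { return EXTENDS; }
"class"         { return CLASS; }
"interface"     { return INTERFACE; }
"if"            { return IF; }
"while"         { return WHILE; }
"return"        { return RETURN; }
"true"          { return BOOLEAN; }
"false"         { return BOOLEAN; }

[[:alpha:]_][[:alnum:]_]*   { return NAME; }     /* letters only, plus optional _ */

"\""[A-Za-z0-9]*"\""        { return STRING; }
"-"?[1-9][0-9]*             { return INT; }

"+"|"-"|"*"|"/"|"="|"=="    { return OPERATOR; }

.               { return yytext[0]; }    /* last rule: pass '{', '}', '(', ';' ... to bison */

%%
```

With these rules, private class Something{} reaches the parser as PRIVATE CLASS NAME '{' '}'. Note that a lone = is still returned as OPERATOR, because that rule comes before the catch-all and flex prefers the earlier rule when two match the same length, so the '=' literal in the Assignment production would not be matched; that is a separate issue you may run into next.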

1
votes

For general parser debugging, you may find it useful to compile your parser (the java.tab.c file) with -DYYDEBUG and stick the line yydebug=1; into your main function before calling yyparse.

This makes the parser print every token it reads and every action it takes, so you can see what it is doing and, usually, why it reports a syntax error for input you believe is correct.
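As a rough sketch, the main function at the end of java.y could be changed as follows. This is a fragment of the grammar file rather than a stand-alone program, and the build commands in the comment assume the file names from the question and bison's default output names.

```c
/* Epilogue of java.y (the part after the second %%).
 *
 * Assumed build steps:
 *   bison -d java.y
 *   flex java.l
 *   gcc -DYYDEBUG java.tab.c lex.yy.c -o java
 */
int main()
{
#if YYDEBUG
   yydebug = 1;   /* only exists when the parser is compiled with YYDEBUG set */
#endif
   yyparse();
   return 0;
}
```

The #if guard keeps the file compiling when -DYYDEBUG is left out. The trace is written to stderr, so it can be redirected separately from the program's normal output.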