2
votes

I generated a lexer with flex:

[ \t\n\r\v]          /* skip whitespace */

[_a-zA-Z]([_a-zA-Z]|[0-9])*  printf("IDENT\n");
[0-9]+        printf("INTEGER\n");
[0-9]+\.      printf("DOUBLE\n");

Now I want to write my own parser in C, but I don't know how to get the tokens from the lexer. Do I have to include "lexer.c" and call yylex()? Then I would have to return enum values instead of calling printf(). What is the best way to do this without using bison/yacc?

1
Isn't there really nice free GNU documentation for the GNU versions of these tools? I'm pretty sure there's also an O'Reilly book, which probably has an online version. (I'm posting this as a comment rather than an answer because I don't have references right off, but I seem to remember seeing them.) - R.. GitHub STOP HELPING ICE

1 Answer

1
votes

You will need to expand that grammar before you are finished, but...

  • Yes, you will replace the printf() statements with appropriate return statements
  • (Or, more likely/better, keep the print statements and add return statements).
  • You will wrap the actions in '{ ... }' braces.
  • You will need to consider how you are going to communicate the token type and the token value back to your parser.

The standard way is to return the token type from yylex(), the function that Flex generates. By convention, a global variable, yylval, conveys the token value, and you control its type. (Yacc normally defines yylval; without Yacc you define it yourself.) Somewhere along the way you will need to specify the token numbers (token types), either as an enumeration or as a series of #defines. Classically, the parser provides this information to the lexical analyzer: Yacc generates a list of the token numbers that it expects to use, and Flex uses those numbers (or, more accurately, you use those numbers in the return statements of your actions, which Flex copies into the generated code). Without Yacc, you write that list yourself in a header shared by the lexer and the parser.
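As a rough illustration of the token numbers and the value variable, here is a minimal sketch of what the shared declarations might look like. The names TOK_IDENT, TOK_INTEGER, TOK_DOUBLE, union token_value and token_name() are hypothetical; only yylex(), yytext and the yylval convention come from Flex/Yacc.

```c
/* Token types returned by yylex(). The names are made up for this sketch.
 * Starting at 256 leaves 1..255 free for single-character tokens, and 0 is
 * what a flex-generated yylex() returns by default at end of input. */
enum token_type {
    TOK_EOF = 0,
    TOK_IDENT = 256,
    TOK_INTEGER,
    TOK_DOUBLE
};

/* Carrier for the token value. The layout is an assumption; pick whatever
 * your parser needs. */
union token_value {
    long        int_value;
    double      double_value;
    const char *text;   /* copy yytext first: it is overwritten by the next yylex() call */
};

/* Without yacc/bison nothing defines yylval for you, so define it once
 * (for example in the parser) and declare it extern in the lexer. */
union token_value yylval;

/* In the .l file an action could then look roughly like this:
 *
 *   [0-9]+   { yylval.int_value = strtol(yytext, NULL, 10); return TOK_INTEGER; }
 */

/* Small helper so the parser can still print token names for debugging. */
const char *token_name(int type)
{
    switch (type) {
    case TOK_EOF:     return "EOF";
    case TOK_IDENT:   return "IDENT";
    case TOK_INTEGER: return "INTEGER";
    case TOK_DOUBLE:  return "DOUBLE";
    default:          return "?";
    }
}
```

In a real project the enum and the union would probably live in a header included by both the .l file and the parser, with yylval defined in exactly one translation unit.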

To get the tokens from the lexer to your parser, you have to call yylex(); each call returns the next token type, and by default it returns 0 at the end of the input. You usually compile the generated lexer separately from your parser, though you could probably #include its generated source in your parser file if you really wanted to.
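To make the calling side concrete, here is a small sketch of a hand-written parser that pulls tokens from yylex() one at a time. So that it runs on its own, it includes a hand-written stand-in for yylex(); with Flex you would delete the stand-in and link the generated lexer instead. The token names, the plain long type for yylval, the input pointer and parse_integer_list() are all assumptions made for the example, and the grammar (a list of integers) is deliberately trivial.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token numbers; 0 means end of input, as with flex. */
enum { TOK_EOF = 0, TOK_IDENT = 256, TOK_INTEGER };

long yylval;                     /* token value; the type is your choice */
static const char *input = "";   /* text scanned by the stand-in lexer */

/* Stand-in for the flex-generated yylex(), NOT what flex produces.
 * It only exists so that this sketch is self-contained. */
int yylex(void)
{
    while (isspace((unsigned char)*input))
        input++;
    if (*input == '\0')
        return TOK_EOF;
    if (isdigit((unsigned char)*input)) {
        char *end;
        yylval = strtol(input, &end, 10);
        input = end;
        return TOK_INTEGER;
    }
    /* Treat any other run of non-space characters as an identifier. */
    while (*input != '\0' && !isspace((unsigned char)*input))
        input++;
    return TOK_IDENT;
}

/* One token of lookahead, refilled by calling yylex(). */
static int lookahead;

static void advance(void)
{
    lookahead = yylex();
}

/* Grammar: list := INTEGER* EOF
 * Returns the sum of the integers, or -1 if another token shows up. */
long parse_integer_list(const char *text)
{
    long sum = 0;

    input = text;
    advance();
    while (lookahead == TOK_INTEGER) {
        sum += yylval;
        advance();
    }
    return lookahead == TOK_EOF ? sum : -1;
}
```

The lookahead variable plus advance() is a common shape for hand-written recursive-descent parsers: each grammar function inspects lookahead, and calls advance() once it has consumed the token.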