Depending on the complexity of your configuration language, you might be better off with a one-pass parser rather than creating an AST and then walking the tree. But both approaches are completely valid.
Probably you should spend a few minutes (or hours :) ) reading the bison manual. Here, I'll just focus on the general approach and the bison features you might use.
The most important one is the ability to pass extra parameters into your parser. In particular, you'll want to pass a reference or pointer to an object which will contain the parsed configuration. You need the additional output parameter because the parser itself will only return a success or failure indication (which you need as well).
So here's a simple example which just constructs a dictionary of names to strings. Note that unlike the author of the tutorial you mention, I prefer to compile both the scanner and the parser as C++, avoiding the need for extern "C" interfaces. This works fine with current versions of flex and bison, as long as you don't try to put non-POD objects onto the parser stack. Unfortunately, that means that we can't use std::string directly; we need to use a pointer (and we can't use a smart pointer either).
File scanner.l:
%{
#include <string>
#include "config.h"
using std::string;
%}
%option noinput nounput noyywrap nodefault
%option yylineno
/* Set the output file to a C++ file. This could also be done on the
 * command line. */
%option outfile="scanner.cc"
%%
"#".* ; /* Ignore comments */
[[:space:]] ; /* Ignore whitespace */
[[:alpha:]_][[:alnum:]_]* { yylval = new string(yytext, yyleng); return ID; }
[[:alnum:]_@]+ { yylval = new string(yytext, yyleng); return STRING; }
["][^"]*["] { yylval = new string(yytext+1, yyleng-2); return STRING; }
. { return *yytext; }
Now the bison file, which only recognizes assignments. This requires bison v3; minor adjustments will be necessary to use it with bison v2.7.
File config.y:
%code requires {
#include <map>
#include <string>
#include <cstdio>
using Config = std::map<std::string, std::string>;
#define YYSTYPE std::string*
extern FILE* yyin;
extern int yylineno;
int yylex();
void yyerror(Config&, const char*);
}
%output "config.cc"
%defines "config.h"
%parse-param { Config& config }
%token ID STRING
%destructor { delete $$; } ID STRING
%%
config: %empty
| config assignment
;
assignment: ID '=' STRING { config[*$1] = *$3;
delete $1; delete $3;
}
| ID '=' ID { config[*$1] = config[*$3];
delete $1; delete $3;
}
%%
#include <iostream>
#include <cerrno>
#include <cstring>
void yyerror(Config& unused, const char* msg) {
std::cerr << msg << " at line " << yylineno << '\n';
}
int main(int argc, const char** argv) {
if (argc > 1) {
yyin = fopen(argv[1], "r");
if (!yyin) {
std::cerr << "Unable to open " << argv[1] << ": "
<< strerror(errno) << '\n';
return 1;
}
} else {
yyin = stdin;
}
Config config;
int rv = yyparse(config);
if (rv == 0)
for (const auto& kv : config)
std::cout << kv.first << ": \"" << kv.second << "\"\n";
return rv;
}
To compile:
flex scanner.l
bison config.y
g++ -std=c++11 -o config config.cc scanner.cc
Try it out:
$ cat sample.config
a=17
b= @a_single_token@
c = "A quoted string"
d9 =
"Another quoted string"
$ ./config sample.config
a: "17"
b: "@a_single_token@"
c: "A quoted string"
d9: "Another quoted string"