Can you explain, perhaps in an answer, exactly what you did and how it worked? As far as I can tell, and as far as I have tested the code in the question, it shouldn't work as you say.
I took your code verbatim, creating files grammar.y and lexer.l. I then compiled the code. I'm working on Mac OS X 10.11.4, using GCC 6.1.0, Bison 2.3 (disguised as yacc) and Flex 2.5.35 (disguised as lex).
$ yacc -d grammar.y
$ lex lexer.l
$ gcc -o gl y.tab.c lex.yy.c
$ ./gl <<< 'a'
0
$
I subsequently made two changes. In grammar.y, I changed main() to:
int main(void) {
#if YYDEBUG
    yydebug = 1;
#endif
    yyparse();
    return 0;
}
and in lexer.l, I changed the default character rule to:
\n|. yyerror("invalid character");
(The . doesn't match newline, so the newline after the a in the input was echoed by default in the original output.)
With a similar compilation, the output becomes:
$ ./gl <<< 'a'
0
invalid character
$
With the compilation specifying -DYYDEBUG too:
$ gcc -DYYDEBUG -o gl lex.yy.c y.tab.c
$
the output includes useful debugging information:
$ ./gl <<< 'a'
Starting parse
Entering state 0
Reading a token: Next token is token AAA ()
Shifting token AAA ()
Entering state 1
Reducing stack by rule 1 (line 12):
$1 = token AAA ()
0
-> $$ = nterm daaaa ()
Stack now 0
Entering state 2
Reading a token: invalid character
Now at end of input.
Stack now 0 2
Cleanup: popping nterm daaaa ()
$ ./gl <<< 'aa'
Starting parse
Entering state 0
Reading a token: Next token is token AAA ()
Shifting token AAA ()
Entering state 1
Reducing stack by rule 1 (line 12):
$1 = token AAA ()
0
-> $$ = nterm daaaa ()
Stack now 0
Entering state 2
Reading a token: Next token is token AAA ()
syntax error
Error: popping nterm daaaa ()
Stack now 0
Cleanup: discarding lookahead token AAA ()
Stack now 0
$
The second a in the input correctly triggers a syntax error (it isn't allowed by the grammar). Other characters are permitted, generate an 'invalid character' message, and are otherwise ignored (so ./gl <<< 'abc' generates 3 invalid character messages, one for the b, one for the c, and one for the newline).
Changing the assignment to yylval in lexer.l to:
yylval = 'a'; // atoi(yytext);
changes the number printed from 0 to 97, which is the character code for 'a' in ASCII, ISO 8859-1, Unicode, etc.
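The two values can be reproduced outside the lexer. This is a minimal sketch, assuming the original rule was yylval = atoi(yytext) as the commented-out code suggests; the helper names value_from_atoi and value_from_char are hypothetical and exist only for illustration:

```c
#include <stdlib.h>

/* Hypothetical helper: what the original rule computed, yylval = atoi(yytext). */
int value_from_atoi(const char *yytext)
{
    /* atoi() finds no digits in text such as "a", so it returns 0. */
    return atoi(yytext);
}

/* Hypothetical helper: the code of the first character, as in yylval = 'a'. */
int value_from_char(const char *yytext)
{
    return (unsigned char)yytext[0];
}
```

With the token text "a", the first helper gives 0 and the second gives 97 on an ASCII-compatible system, matching the two numbers the parser printed.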
I've been using a here string as the source of data. It would be equally feasible to have used a file as the input:
$ echo a > program
$ cat program
a
$ ./gl < program
Starting parse
Entering state 0
Reading a token: Next token is token AAA ()
Shifting token AAA ()
Entering state 1
Reducing stack by rule 1 (line 12):
$1 = token AAA ()
97
-> $$ = nterm daaaa ()
Stack now 0
Entering state 2
Reading a token: invalid character
Now at end of input.
Stack now 0 2
Cleanup: popping nterm daaaa ()
$
If you want to read files specified by name on the command line, you have to write more code in main() to process those files.
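One way that extra code might look is sketched below. The function parse_named_files is hypothetical and not part of the original code. It takes the address of the lexer's input stream and the parser entry point as arguments so that the sketch compiles on its own. In grammar.y you would declare extern FILE *yyin; and call it from main(int argc, char **argv) as parse_named_files(argc, argv, &yyin, yyparse).

```c
#include <stdio.h>

/*
 * Hypothetical helper: run the parser once per file named on the command
 * line, or once on standard input when no files are named.
 * 'in' is the address of the lexer's input stream (yyin in lex/flex) and
 * 'parse' is the parser entry point (yyparse in yacc/bison).
 * Returns the number of files that could not be opened or parsed.
 */
int parse_named_files(int argc, char **argv, FILE **in, int (*parse)(void))
{
    int failures = 0;

    if (argc < 2) {
        /* No file names given: read standard input, as the original does. */
        *in = stdin;
        return parse() != 0;
    }

    for (int i = 1; i < argc; i++) {
        FILE *fp = fopen(argv[i], "r");
        if (fp == NULL) {
            perror(argv[i]);    /* report the file and carry on */
            failures++;
            continue;
        }
        *in = fp;               /* point the lexer at this file */
        if (parse() != 0)
            failures++;
        fclose(fp);
    }
    return failures;
}
```

If a parse stops early on a syntax error, Flex may still hold buffered input from the previous file. Calling yyrestart() before the next parse is the usual remedy, but that is not shown here.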
atoi? – Brian Tompsett - 汤莱恩
atoi("a") returns zero, and so does atoi("b"). There is no 'subtracting var by "a"' here, and nothing surefire about this bug in your code. – user207421
./gl <<< 'a' in Bash), it prints 0 and a couple of newlines. I'm not sure what you expected it to print. It will only read from standard input unless you take steps to organize it differently (by setting yyin to point to a different file stream). – Jonathan Leffler