Flex sets YY_STATE to INITIAL by default when a yyscan_t scanner is initialized.
I'm trying to make a reentrant scanner that can start in a user-specified state instead of INITIAL.
Here is the case:
/* comment start //not passed into flex
in comment //first line passed into flex
end of comment*/ //second line passed into flex
For some reason, these two lines are fed into the reentrant scanner separately, and the YY_STATE each line belongs to is known. What I need is to pass the comment state into the reentrant scanner and switch YY_STATE to COMMENT before it starts lexing in comment\n.
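To make the requirement concrete, the contract being asked for can be modeled in plain C, independent of flex: scan one line beginning in a caller-supplied state and report the state the line ends in. This is only a hand-written stand-in for the generated scanner, and the names scan_line, ST_INITIAL, ST_COMMENT and code_chars are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state names mirroring the flex start conditions. */
enum lex_state { ST_INITIAL = 0, ST_COMMENT = 1 };

/*
 * Scan one line beginning in `state` and return the state at end of line.
 * *code_chars receives the number of non-newline characters that lie
 * outside comments.
 */
static enum lex_state scan_line(const char *line, enum lex_state state,
                                int *code_chars)
{
    const char *p = line;

    assert(line != NULL && code_chars != NULL);
    *code_chars = 0;

    while (*p != '\0') {
        if (state == ST_INITIAL && p[0] == '/' && p[1] == '*') {
            state = ST_COMMENT;      /* comment opener */
            p += 2;
        } else if (state == ST_COMMENT && p[0] == '*' && p[1] == '/') {
            state = ST_INITIAL;      /* comment closer */
            p += 2;
        } else {
            if (state == ST_INITIAL && *p != '\n')
                (*code_chars)++;     /* ordinary text outside a comment */
            p++;
        }
    }
    return state;
}
```

With this model, scan_line("in comment\n", ST_COMMENT, &n) stays in ST_COMMENT, and scan_line("end of comment*/\n", ST_COMMENT, &n) ends in ST_INITIAL. The caller keeps the returned state and passes it in with the next line.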
My workaround is to add a dummy token at the head of each line and pass the state into flex as yyextra. Once the dummy token is recognized, the scanner switches to the specified state, so flex begins lexing the line in that YY_STATE. However, adding a dummy token at the beginning of every line is time-consuming.
Here is how I call the reentrant scanner:
yyscan_t scanner;
YY_BUFFER_STATE buffer;
yylex_init(&scanner);
buffer = yy_scan_string(inputStr, scanner);
yyset_extra(someStructure, scanner);
yylex(scanner);
yy_delete_buffer(buffer, scanner);
yylex_destroy(scanner);
Is it possible to set YY_STATE before yylex(scanner) is called?
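One possible approach, sketched here under stated assumptions rather than as a confirmed solution: flex runs any code placed before the first rule each time yylex() is entered, so the desired start condition can be carried in yyextra and applied there with BEGIN, with no dummy token in the input. The struct line_ctx, its fields, and the helper lex_line are hypothetical names, and the rules are placeholders for the real grammar. The `<<EOF>>` rule records YY_START (flex's macro for the current start condition) so the caller can hand it to the next line.

```lex
%option reentrant noyywrap
%option extra-type="struct line_ctx *"

%{
/* Hypothetical per-line context; not part of the original post. */
struct line_ctx {
    int start_state;  /* start condition to enter, e.g. COMMENT */
    int pending;      /* nonzero until start_state has been applied */
    int end_state;    /* start condition in effect at end of input */
};
%}

%x COMMENT

%%
%{
    /* Executed on every entry to yylex(), before any input is matched. */
    if (yyextra->pending) {
        BEGIN(yyextra->start_state);
        yyextra->pending = 0;
    }
%}

<INITIAL>"/*"   { BEGIN(COMMENT); }
<COMMENT>"*/"   { BEGIN(INITIAL); }
<COMMENT>.|\n   { /* skip comment text */ }
<INITIAL>.|\n   { /* placeholder: real token rules go here */ }
<<EOF>>         { yyextra->end_state = YY_START; yyterminate(); }
%%

/* Lex one line starting in start_state; return the state it ends in,
 * or -1 if the scanner could not be created. */
int lex_line(const char *text, int start_state)
{
    struct line_ctx ctx = { start_state, 1, start_state };
    yyscan_t scanner;
    YY_BUFFER_STATE buffer;

    if (yylex_init_extra(&ctx, &scanner) != 0)
        return -1;
    buffer = yy_scan_string(text, scanner);
    while (yylex(scanner) != 0) {
        /* tokens would be consumed here */
    }
    yy_delete_buffer(buffer, scanner);
    yylex_destroy(scanner);
    return ctx.end_state;
}
```

The pending flag matters if yylex() returns once per token, since the entry code would otherwise reset the start condition on every call. The sketch creates and destroys a scanner per line only to mirror the calling sequence above; keeping one yyscan_t alive and calling yy_scan_string for each new line should avoid that overhead.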
… yyscan_t object. Are you doing that? – rici
… yyscan_t? It's not sending an extra token that's slow: it's the overhead of creating and destroying scanner states. The scanner state is reusable without problems. – rici
… yyscan_t will save a lot of redundant init and destroy. In this case the first parsed line is in comment\n, not /* comment start\n. Flex won't know this line is in a comment unless we let it know. – Shiang Dza
… yylex? Do you just call it once to lex the entire line, or do you call it for each token? And how do you save the current lexical state? – rici