I'm writing a flex/bison parser, and need to identify the following pattern using Flex:
begin
/*some code*/
end
The above pattern may appear several times in the input. For example:
begin
/*some code #1*/
end
/*some code #2*/
begin
/*some code #3*/
end
It is important for me to identify the pattern in the lexer, but when using the following regex:
block "begin"[.\n]*"end"
{block} { return ID_BLOCK; }
it catches the first begin and the LAST end. I would like it to stop at the first end instead.
Note #1: flex does not support every regex feature, so I cannot use a zero-length lookahead assertion.
Note #2: I think the best approach is to stop at the first match of "block" rather than continue filling the buffer; I just don't know how to do it.
****EDIT**** The words begin and end are just a simplified example; the real markers are unique strings that look like this:
//BEGIN_SPECIAL_CODE
/*relevant code*/
//END_SPECIAL_CODE
`[.]` is a literal `.`, so that is all it will match. `.` matches any character except a newline. So neither of those will match your `begin ... end` block. Please include real code in your question. – rici

Does `end` mark the end of a block? What if the block contains the comment `/* This comment extends the block */`? – rici

`[.\n]*` recognises dots and newlines; any number of them, but only those two characters. Regex operators are not special inside character classes (and that's not a flex quirk; you'll find it to be true in pretty well all regex libraries). But that's just a detail; I'm sticking with my answer. – rici
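
The point of the last comment, that a character class such as `[.\n]` holds only a literal dot and a newline and that this is not specific to flex, can be checked against another regex library. The following is a minimal sketch in C using POSIX `regcomp`/`regexec` rather than flex itself. The helper `full_match` and the macro `BLOCK_PATTERN` are hypothetical names introduced for illustration, and the `^`/`$` anchors are an added assumption so that the whole string has to match.

```c
#include <regex.h>
#include <stddef.h>

/* The pattern from the question, anchored so the whole string must match.
 * Note: here the C compiler turns "\n" into a real newline character,
 * whereas flex interprets the \n escape itself inside the class. */
#define BLOCK_PATTERN "^begin[.\n]*end$"

/* Hypothetical helper: returns 1 if `text` matches `pattern`,
 * 0 if it does not, and -1 if the pattern fails to compile. */
int full_match(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_EXTENDED: POSIX extended syntax; REG_NOSUB: only report yes/no. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Under these assumptions, `full_match(BLOCK_PATTERN, "begin..\n.end")` returns 1, because only dots and newlines sit between the two keywords, while `full_match(BLOCK_PATTERN, "begin\n/*some code*/\nend")` returns 0, because characters such as `/` and `s` are not members of the class.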