1
votes

I've been looking to recognise a language which does not fit the general Flex/Bison paradigm. It has completely different token rules depending on semantic context. For example:

main() {
    batchblock
    {
echo Hello World!
set batchvar=Something
echo %batchvar%
    }
}

Bison apparently supports these kinds of grammars, but it needs "lexical tie-ins" to handle them effectively. It provides an interface for this, but I'm confused as to how exactly I can supply different flex regexes depending on the context, if this is even possible.

Thanks in advance :)


2 Answers

2
votes
"I'm confused as to how exactly I can supply different flex regexes depending on the context"

Flex has a state mechanism (the manual calls these start conditions) whereby you can switch it between different sets of regexes. The syntax for this is

%x name_of_state

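in the definitions section at the top of the file (after the %} and before the first %%). Then, in the rules section (after the first %%), prefix the pattern with the state name, with no whitespace between the > and the pattern:

<name_of_state>regex_goes_here

This regex is then only matched while the scanner is in that state. Because %x declares the state as exclusive, rules without a prefix are inactive while it is active; declare it with %s instead if they should stay active. There is also the special prefix <*>, which makes a rule match in every state.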
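To make the layout concrete, here is a minimal sketch of a complete scanner file. The BATCH state, the token codes, and the small main driver are all made up for illustration; in a real project the token codes would come from the Bison-generated header and the parser would call yylex.

```lex
%{
#include <stdio.h>
/* Hypothetical token codes for illustration; normally these come
   from the Bison-generated header. */
enum { IDENT = 258, BATCH_WORD };
%}
%option noyywrap
%x BATCH

%%
"batchblock"[ \t\n]*"{"   { BEGIN(BATCH); /* switch to the exclusive BATCH rules */ }
[A-Za-z_]+                { return IDENT; /* no prefix: not active while in BATCH */ }
<BATCH>[^ \t\n}]+         { return BATCH_WORD; /* matched only while in BATCH */ }
<BATCH>"}"                { BEGIN(INITIAL); }
<*>[ \t\n]+               { /* skip whitespace in every state */ }
<*>.                      { return yytext[0]; }
%%

/* Stand-in driver: prints each token code and its text. */
int main(void) {
    int tok;
    while ((tok = yylex()) != 0)
        printf("%d\t%s\n", tok, yytext);
    return 0;
}
```

Assuming the file is saved as demo.l, it should build with `flex demo.l && cc lex.yy.c -o demo` and read from standard input.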

There is more than one way to change states. BEGIN(name_of_state) switches directly, and BEGIN(INITIAL) goes back to the initial state. If you want to keep a stack of states, use yy_push_state(name_of_state) and yy_pop_state() instead; these are only available if you add %option stack to the definitions section.
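The stack is mostly useful when one state can be entered from several others and has to return to whichever one was active. A rough sketch, assuming purely for illustration that /* */ comments are allowed both inside and outside the block (the state names and the token code are hypothetical):

```lex
%{
#include <stdio.h>
/* Hypothetical token code for illustration only. */
enum { BATCH_WORD = 258 };
%}
%option noyywrap stack
%x BATCH COMMENT

%%
"batchblock"[ \t\n]*"{"   { yy_push_state(BATCH); }
<INITIAL,BATCH>"/*"       { yy_push_state(COMMENT); /* remembers the state we came from */ }
<COMMENT>"*/"             { yy_pop_state(); /* back to INITIAL or BATCH, whichever was active */ }
<COMMENT>.|\n             { /* discard comment text */ }
<BATCH>"}"                { yy_pop_state(); }
<BATCH>[^ \t\n}/]+        { return BATCH_WORD; /* '/' is excluded so a longer word match cannot swallow a comment opener */ }
<*>[ \t\n]+               { /* skip whitespace in every state */ }
<*>.                      { return yytext[0]; }
%%

/* Stand-in driver: prints each token code and its text. */
int main(void) {
    int tok;
    while ((tok = yylex()) != 0)
        printf("%d\t%s\n", tok, yytext);
    return 0;
}
```

With plain BEGIN, the rule that closes the comment would have to know which state to return to; here yy_pop_state() restores it from the stack.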

1
votes

If your special block is always introduced by 'batchblock {', this can be handled entirely inside flex. On the Bison (or byacc, if you want to make your life at least a little easier) side, you'd just see different tokens inside the block, something like 'BATCH_ECHO'.

To handle it inside of flex, you'd use its start conditions capability:

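%{
#include "parser.tab.h"   /* assumed name of the Bison-generated header defining BATCH_ECHO etc. */
%}
%x batchblock
ws      [ \t\r\n]*
%%

"batchblock"{ws}\{   { BEGIN(batchblock); }

<batchblock>echo     { return BATCH_ECHO; }
<batchblock>set      { return BATCH_SET;  }
    /* ... (comments in the rules section must be indented) */
<batchblock>\}       { BEGIN(INITIAL);    }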

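The patterns that start with <batchblock> can only match in the "batchblock" start condition, which is entered by BEGIN(batchblock). Because it is declared with %x, the patterns without a prefix are switched off until BEGIN(INITIAL) runs.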
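If the switch has to come from the parser instead (the lexical tie-in mentioned in the question), one option is to wrap BEGIN in ordinary functions that a grammar action can call. The sketch below is only an illustration: the wrapper names and token codes are hypothetical, and main stands in for the parser. One thing to watch is that Bison may already have read a lookahead token when an action runs, so the switch has to happen at a point where the next token has not been scanned yet.

```lex
%{
#include <stdio.h>
#include <string.h>
/* Hypothetical token codes; a real build would include the
   Bison-generated header instead. */
enum { WORD = 258, BATCH_ECHO, BATCH_SET };
/* Hypothetical wrapper names, callable from other files such as the parser. */
void scanner_enter_batch(void);
void scanner_leave_batch(void);
%}
%option noyywrap
%x batchblock

%%
<batchblock>echo          { return BATCH_ECHO; }
<batchblock>set           { return BATCH_SET; }
<batchblock>[^ \t\n{}]+   { return WORD; }
[A-Za-z_]+                { return WORD; }
<*>[ \t\n]+               { /* skip whitespace in every start condition */ }
<*>.                      { return yytext[0]; }
%%

/* BEGIN is a macro that exists only inside the generated scanner,
   so other files need real functions to switch start conditions. */
void scanner_enter_batch(void) { BEGIN(batchblock); }
void scanner_leave_batch(void) { BEGIN(INITIAL); }

/* Stand-in for the parser: switches where grammar actions would. */
int main(void) {
    int tok, saw_keyword = 0;
    while ((tok = yylex()) != 0) {
        printf("%d\t%s\n", tok, yytext);
        if (tok == '{' && saw_keyword)
            scanner_enter_batch();      /* "batchblock {" was just seen */
        else if (tok == '}')
            scanner_leave_batch();      /* harmless if already in INITIAL */
        saw_keyword = (tok == WORD && strcmp(yytext, "batchblock") == 0);
    }
    return 0;
}
```

Here the scanner itself never calls BEGIN in a rule; all switching is driven from outside, which is the part a Bison action would take over.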