1 vote

If I have a language with basic things such as

a = expression

if expression then ...

while expression do ...

then I might have a grammar that looks like this: (pseudo code)

assignment: identifier Equals expression ;
if        : If expression Then ...
while     : While expression Do ...

Now the type of the first expression just needs to be type-compatible with 'a', but the other two expressions must be of type boolean.

While it's easy enough to check the type of each expression wherever it is used, it seems to me that it would be very convenient to define in the grammar

boolExpression : expression;

and then my other rules would look like this:

assignment: identifier Equals expression ;
if        : If boolExpression Then ...
while     : While boolExpression Do ...

This would let me check in one place that a boolExpression has BOOL type, so I wouldn't have to add a separate test at every place an expression is used.
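For concreteness, here is roughly the kind of single check I have in mind for the tree walk (a rough Python sketch; the names and the compute_type callback are made up for illustration):

def visit_bool_expression(expr_node, compute_type):
    # One check at the boolExpression node instead of a separate
    # check at every if/while site.
    expr_type = compute_type(expr_node)
    if expr_type != "BOOL":
        raise TypeError("expected a boolean expression, got " + expr_type)
    return expr_node

# The if/while visitors then never need a type test of their own:
visit_bool_expression("flag > 0", lambda e: "BOOL")    # accepted
# visit_bool_expression("count + 1", lambda e: "INT")  # would raise TypeError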

But have I turned the context free grammar into a context sensitive grammar by doing this? And if so, does it matter?


2 Answers

2 votes

If you have boolean variables, then you cannot verify syntactically whether an expression is boolean or not. If you want to catch every type error, you will need some kind of semantic feedback, which will indeed make the language context-sensitive, and which also makes left-to-right parsing impossible unless you require variables to be declared before use. [See note 1] Semantic feedback of this form is generally considered a complication, since it produces a tight coupling between lexical analysis and parsing, but many parsers do it, whether for convenience or out of necessity.
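As a tiny illustration of that feedback (my own sketch, with invented names): the parser cannot tell from the tokens of while flag do alone whether the condition is boolean; it has to ask a symbol table that was filled in from earlier declarations.

symbol_types = {}                 # filled in as declarations are parsed

def declare(name, type_name):
    symbol_types[name] = type_name

def condition_is_boolean(identifier):
    # A question the grammar alone cannot answer.
    return symbol_types.get(identifier) == "bool"

declare("flag", "bool")
declare("count", "int")
print(condition_is_boolean("flag"))    # True:  'while flag do ...'  type-checks
print(condition_is_boolean("count"))   # False: 'while count do ...' does not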

But perhaps you are content with detecting only certain contexts. If so, your grammar is certainly workable and does not create context sensitivity. if and while statements definitely provide a syntactically-determined boolean context, and arithmetic operators provide a determined numeric context (although you might still need type checks if you have more than one arithmetic type). But there are contexts whose types are not syntactically determined, such as assignment statements, equality operators, and function calls, so you will end up doing special-case type checks in various places in the tree walk anyway.

Personally, I don't see much advantage in complicating the grammar to do partial type-checking; I prefer to do a type validation/deduction pass after the parse. But in some use cases, a less flexible approach is perfectly reasonable.
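To sketch what such a post-parse pass might look like (purely an illustration; the node representation and type names here are invented, not taken from any particular tool):

class Node:
    def __init__(self, kind, **kw):
        self.kind = kind
        self.__dict__.update(kw)

def expr_type(node, env):
    # Returns the type of an expression node, checking operands as it goes.
    if node.kind == "num":
        return "int"
    if node.kind == "var":
        return env[node.name]
    if node.kind == "binop":
        lt = expr_type(node.left, env)
        rt = expr_type(node.right, env)
        if node.op in ("+", "-", "*"):
            assert lt == rt == "int", "arithmetic needs int operands"
            return "int"
        if node.op in ("==", "!="):
            assert lt == rt, "equality needs operands of the same type"
            return "bool"
    raise ValueError("unknown expression: " + node.kind)

def check_stmt(node, env):
    # One place to enforce each statement's typing rule.
    if node.kind == "assign":
        assert expr_type(node.value, env) == env[node.target], "assignment type mismatch"
    elif node.kind in ("if", "while"):
        assert expr_type(node.cond, env) == "bool", "condition must be boolean"
        for s in node.body:
            check_stmt(s, env)

# 'while n == 0 do n = n + 1' with n declared as int
env = {"n": "int"}
loop = Node("while",
            cond=Node("binop", op="==",
                      left=Node("var", name="n"), right=Node("num")),
            body=[Node("assign", target="n",
                       value=Node("binop", op="+",
                                  left=Node("var", name="n"), right=Node("num")))])
check_stmt(loop, env)   # passes; make the condition Node("var", name="n") and it fails

The walk has to know each construct's typing rule in exactly one place, which gives you the same economy you were after with boolExpression, but without touching the grammar.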

Notes

  1. Some languages are very firm about declaration before use (C, for example), but others let the programmer order declarations logically without worrying about whether references to them appear earlier (C++ class declarations, for example).

    Personally, I prefer not to complicate the programmer's life by insisting on declaration before use. Indeed, as a programmer I appreciate languages which can deduce variable types rather than insisting on redundant declarations. But tastes differ. In any event, mandatory declaration does not necessarily imply declaration before use, and many languages with mandatory declaration still provide flexibility about declaration order, allowing (for example) mutually recursive functions without redundant forward declarations.

1 vote

No, those changes do not turn your context-free grammar into a context-sensitive grammar.

You can tell the difference just by looking at the left-hand sides of the productions: as long as every LHS is a single nonterminal, it's a CFG. If any LHS has multiple symbols, then it's a CSG. For example, a context-sensitive rule might look like a B c : a x c, which says that B can be rewritten to x only when it appears between a and c; the surrounding symbols are the "context".

(That's a bit of an oversimplification, but it's accurate enough to answer your question.)

That covers grammars. When it comes to languages, note that the language your grammar generates is different from the language of type-valid programs. The former is context-free, but the latter probably isn't: the classic example is that a requirement like "every variable must be declared before it is used" cannot be enforced by any context-free grammar.