1
votes

I'm reading finite automata & grammars from the compiler construction of Aho and I'm stuck with this grammar for so long. I don't have a clear perception of how I can describe it:

Consider the following Grammar:

S -> (L) | a L -> L, S | S

Note that the parentheses and comma are actually terminals in this language and appear in the sentences accepted by this grammar. Try to describe the language generated by this grammar. Is this grammar ambiguous?

My concern here is: Can language generated by this grammar be described as regular expressions? I'm confused about how to do it. Any help?

2

2 Answers

6
votes

To show that the grammar is ambiguous, you need to be able to construct two different parse trees while parsing the same string. Your string will be comprised of "(", ")", ",", and "a", since those are the only terminal symbols in the grammar.

Try arranging those 4 terminal symbols in a few ways and see if you can show different successful parses, in the spirit of the example ambiguous grammar on Wikipedia.

Immediate left recursion tends to cause problems for some parsers. See if "a,a,a" does anything interesting on "L → L , S | S"...

my concern here is language generated by this grammar as regular expression can it be described...i'am confused about how to do

A regular expression can not fully describe the grammar. Rewriting part of the grammar will make this more apparent:

  1. S → ( L )
  2. S → a
  3. L → L , S
  4. L → S

Pay attention to #1 and #4. L can produce S, and S can produce ( L ). This means S can produce ( S ), which can produce ( ( S ) ), ( ( ( S ) ) ), etc. ad infinitum. The key thing is that those parentheses are matched; there are the same amount of "(" symbols as ")" symbols.

A regex can't do that.

Regular expressions map to finite automata. Finite automata can not count. A language L ∈ {w: 0n 1n} is not a regular. L ∈ {w: (n )n}, just being a substiution of "(" for "0" and ")" for "1", isn't either. See: the first examples section under Regular Languages - Wikipedia. (Notation note: s1 is s, s2 is ss, ..., sn is s repeated n times.)

This means you can't use a regex to describe that part of the language. That puts it in the domain of CFGs, Turing Machines, and pushdown automata.

3
votes

Regular expressions (and a library to interpret them) are a poor tool for recognizing sentences of a context-free grammar. Instead, you would want to use a parser generator like yacc, bison, or ANTLR.

I think the point of the exercise in Aho's book is to "describe the language" in words, in order to understand whether it is ambiguous. One way to approach it: can you devise a grammatical sentence that can be parsed in two different ways, given the productions of the grammar? If so, the grammar is ambiguous.