3
votes

I have the following grammar:

S -> S+S|SS|S*|(S)|a

How do I convert it into a LL(1) grammar?

I tried to eliminate left recursion so I got

S->(S)S'|aS'
S'->+SS'|SS'|*S'|epsilon

I also tried to do left factoring first and then eliminate left recursion and I got something like :

S->(S)S"|aS"
       S"->S'S"|epsilon
       S'->+S|*|S

But I still do not get a perfect answer. I feel grammar is still not LL(1). Please help.

1
This question appears to be theoretical rather than practical. (The grammar does not appear to be realistic.) Try cs.stackoverflow.com - Raymond Chen
@RaymondChen: It's a (simplified) grammar for regular expressions using + for the alternation operator usually written as |. Why is that not realistic? - rici
@rici Most languages have more terminals than a. If you want a regular expression parser, then just use an existing one - no need to reinvent the wheel. - Raymond Chen
I might be missing something, but how is the second grammar not already LL(1)? - user824425

1 Answers

1
votes

It might help to try writing the grammar so that you read some complete term, then optionally try extending it in some way. For example, you might try something like this:

S → Term

Term → CoreTerm OptMore

CoreTerm → a | (Term)

OptMore → ε | Term | + Term | * OptMore

For example, you'd derive a(a+a)*a as

S

⇒ Term

⇒ CoreTerm OptMore

⇒ a OptMore

⇒ a Term

⇒ a CoreTerm OptMore

⇒ a(CoreTerm OptMore) OptMore

⇒ a(a OptMore) OptMore

⇒ a(a + Term) OptMore

⇒ a(a + CoreTerm OptMore) OptMore

⇒ a(a + a OptMore) OptMore

⇒ a(a + a) OptMore

⇒ a(a + a)* OptMore

⇒ a(a + a)* Term

⇒ a(a + a)* CoreTerm OptMore

⇒ a(a + a)* a OptMore

⇒ a(a + a)* a

To see that this is an LL(1) grammar, here's the FIRST sets:

  • FIRST(S) = {
  • FIRST(Term) = { a, ( }
  • FIRST(CoreTerm) = { a, ( }
  • FIRST(OptMore) = { ε, a, (, +, * }

Here's the FOLLOW sets:

  • FOLLOW(S) = {$}
  • FOLLOW(Term) = {$, )}
  • FOLLOW(CoreTerm) = {a, (, +, *, $}
  • FOLLOW(OptMore) = {$, )}

So now we can fill in the parse table:

         | a                | (                | +      | *         | )   | $
---------+------------------+------------------+--------+-----------+-----+------
S        | Term             | Term             |        |           |     |
Term     | CoreTerm OptMore | CoreTerm OptMore |        |           |     |
CoreTerm | a                | (Term)           |        |           |     |
OptMore  | Term             | Term             | + Term | * OptMore | eps | eps

So this grammar is indeed LL(1).