Left-factoring a grammar

Question

So i have this grammar (below) and i need to build a parse table. I need to make this suitable for a predictive parser. I know the first think is to make it unambiguous, but for me it's already unambiguous (since i can't find a string for which i can draw 2 different parse trees). Second i need to make it left factored. I put my guess below the original grammar, i sens i'm missing something can someone point out if i'm missing something.

S -> m G | m K p
G -> n G | n
K -> q K r | m n

My guess:

S -> m A
A -> G | K p 
G -> n G'
G' -> n G' | emptyString
K -> q K r | m n

Do I misunderstand or do you have an extra m in there? S -> mQ and Q -> mA will give you an extra m, right? — Hassan
correct my mistake, ill update. thanks for pointing that out — Bobby
i tried to find the first and follow for non-terminal, and i got $ for all follows except for K, I thought maybe something is wrong since i haven't played with grammars that much. — Bobby
It's obvious by inspection that no non-terminal other than K can be followed by anything other than $. — rici

templatetypedef templatetypedef · Accepted Answer · 2018-12-16T04:28:55

What you have looks correct! Here's a step-by-step way to see how to get there, along with why each transformation is necessary.

First, let's look at our S nonterminal. This nonterminal has two productions that start with m, meaning that we have a FIRST/FIRST conflict between those productions. Left-factoring the productions S → mG and S → mKp gives us

S → mA

A → G

A → Kp

Now, did doing this expose any problems that previously weren't there? Fortunately, no. The nonterminal G can only produce strings starting with n, and the nonterminal K can only produce string starting with q or m. That means that we didn't introduce any FIRST/FIRST conflicts here, so there's no need to touch anything further - at least, not yet.

Next, let's look at the G nonterminal, which has the productions G → nG and G → n. In other words, this produces a string of one or more copies of the letter n. As it's currently written, there's a FIRST/FIRST conflict. There are many ways we can rewrite this. The one you proposed was essentially to split this into two pieces - one piece that generates a single n, and a follow-up piece that generates zero or more copies of n. I'll follow your lead here in doing that, which introduces a new nonterminal I'm going to call H to differentiate it from G:

G → nH

H → nH | ε

Now, we have to ask - does this ε production introduce any FIRST/FOLLOW conflicts? To answer that, we're going to need to determine what FOLLOW(H) is. We see that H only appears at the end of the productions H → nH (which doesn't give us anything new) and G → nH, which tells us that everything in FOLLOW(G) is going to also be in FOLLOW(H). And what's in FOLLOW(G)? G appears at the end of the production A → G, which tells us everything in FOLLOW(A) is going to be in FOLLOW(H). And A only appears in S → mA, meaning that the only token in FOLLOW(A) is the end-of-input marker $. Phew! So FOLLOW(H) = { $ }. And that's good news, because that doesn't conflict with the H → nH production.

That leaves behind the production rules for K, which fortunately don't have any issues with them.

Putting this all together, we get that the net transformed grammar is

S &rarr mA

A → G | Kp

G → nH

H → nH | ε

K → qKr | mn

This happens to be LL(1). Here's the parse table:

     m      n       q     r      $
  +------+------+------+------+------+
S |  mA  |      |      |      |      |
  +------+------+------+------+------+
A |  Kp     G      Kp  |      |      |
  +------+------+------+------+------+
G |      |  nH  |      |      |      |
  +------+------+------+------+------+
H |      |  nH  |      |      | eps  |
  +------+------+------+------+------+
K |  mn  |      | qKr  |      |      |
  +------+------+------+------+------+

Look ma! No conflicts!

Left-factoring a grammar

1 Answers