2
votes

I wonder how to design this regular grammar, or how to convert my context free grammar to regular grammar (like A->aA).I tried but no result for this.

Question: The set of strings on Σ ={a,b}which contain at least two occurrences aaa, and at least one occurrence of bbbb.[aaaabbbb counts]. The grammar should be regular.

My answer in CFG is A will check if 'aaaa' occur in the word or not B will check if 'bbb' occur in the word or not C will check if 'aaa' occur at least twice in the word or not

S -> AB | BA | CBC | CCB | BCC

A -> aaaa | aA | Aa | bA | Ab

B -> bbb | aB | Ba | bB | Bb

C -> aaa | aC | Ca | bC | Cb

1
You could simplify that CFG a lot by defining a non-terminal which is "any sequence of a and b" instead of adding those sequence to the beginning and end of A, B and C. That should make it a lot easier to see how to form the regular expression.rici

1 Answers

2
votes

To get a regular grammar, you can try to write down a DFA first; converting a DFA to a regular grammar is trivial. To get a DFA, we can use the Cartesian Product Machine construction here.

Start with DFAs for the languages L1 containing aaaa or two instances of aaa, and L2 containing bbbb. The DFAs for these are straightforward:

        b                          a,b
  /--+------+-------\             /--\
  \  |      |       |             \  |
   \ V      |       |              \ V
L1: q0--a-->q1--a-->q2--a-->q3--a-->q4
                           / ^
    /-+-----+---b---+----/   |
    | |     |                |
    V /     |                |
    q5--a-->q6--------a------/


                a                  a,b
  /--+------+-------+-------\     /--\
  \  |      |       |       |     \  |
   \ V      |       |       |      \ V
L2: q0--b-->q1--b-->q2--b-->q3--b-->q4

The Cartesian Product Machine construction is going to give us a 35-state DFA: eight states in the first one times five states in the second one. We'll call these states q00, q01, …, q64. Then the regular grammar is just a different way of writing the transitions. Here's what it ends up looking like:

q00 -> aq10 | bq01
q01 -> aq10 | bq02
q02 -> aq10 | bq03
q03 -> aq10 | bq04
q04 -> aq14 | bq04
q10 -> aq20 | bq01
q11 -> aq20 | bq02
q12 -> aq20 | bq03
q13 -> aq20 | bq04
q14 -> aq24 | bq04
q20 -> aq30 | bq01
q21 -> aq30 | bq02
q22 -> aq30 | bq03
q23 -> aq30 | bq04
q24 -> aq34 | bq04
q30 -> aq40 | bq51
q31 -> aq40 | bq52
q32 -> aq40 | bq53
q33 -> aq40 | bq54
q34 -> aq44 | bq54
q40 -> aq40 | bq41
q41 -> aq40 | bq42
q42 -> aq40 | bq43
q43 -> aq40 | bq44
q44 -> aq44 | bq44
q50 -> aq60 | bq51
q51 -> aq60 | bq52
q52 -> aq60 | bq53
q53 -> aq60 | bq54
q54 -> aq64 | bq54
q60 -> aq30 | bq51
q61 -> aq30 | bq52
q62 -> aq30 | bq53
q63 -> aq30 | bq54
q64 -> aq34 | bq54

We notice some of the nonterminals never show up on the right side of a production. We can simplify the grammar by getting rid of these:

q00 -> aq10 | bq01
q01 -> aq10 | bq02
q02 -> aq10 | bq03
q03 -> aq10 | bq04
q04 -> aq14 | bq04
q10 -> aq20 | bq01
q14 -> aq24 | bq04
q20 -> aq30 | bq01
q24 -> aq34 | bq04
q30 -> aq40 | bq51
q34 -> aq44 | bq54
q40 -> aq40 | bq41
q41 -> aq40 | bq42
q42 -> aq40 | bq43
q43 -> aq40 | bq44
q44 -> aq44 | bq44
q50 -> aq60 | bq51
q51 -> aq60 | bq52
q52 -> aq60 | bq53
q53 -> aq60 | bq54
q54 -> aq64 | bq54
q60 -> aq30 | bq51
q64 -> aq34 | bq54

That should get you pretty close to where you want to be. At this point we just need to add some productions to encode the fact that q44 is the only accepting state. You could add q44 -> e if that's allowed, or just wherever you have q -> sq44 add an extra production of the form q -> s.