10
votes

Can someone much brighter than I succinctly describe to the SO community the NFA to DFA conversion algorithm? (Preferably in 500 words or less.) I've seen diagrams and lectures that have only served to confuse what I thought I once knew. I'm mostly confident in generating an initial NFA transition table from a state diagram, but after that, I lose the DFA in the epsilons and subsets.

1) In a transition (delta) table, which column represents the new DFA states? Is it the first column of generated states?

2) In row {2,3}, col 0 of my example below, what does the {2,3} mean about the NFA in terms of its state diagram? (Sorry, I must think in pictures.) And I assume it will be a "loop-back on input 0" in the DFA?

3) Any simple "rules of thumb" on getting from the table to the DFA or on recognizing the accept states of the resulting DFA?

Finitely Autonomous

delta  |  0    |  1     |
=======+=======+========+
{1}    |{1}    |{2}     |
{2}    |{3}    |{2,3}   |
{3}    |{2}    |{2,4}   |
{2,3}  |{2,3}  |{2,3,4} |
{2,4}  |{3,4}  |{2,3,4} |
{2,3,4}|{2,3,4}|{2,3,4} |
{3,4}  |{2,4}  |{2,4}   |

Edit: Here is the above table in dot format, cheers Regexident.

digraph dfa {
    rankdir = LR;
    size = "8,5"
/*  node [shape = doublecircle]; "1";*/
    node [shape = circle];

    "1" -> "1" [ label = "0" ];
    "1" -> "2" [ label = "1" ];

    "2" -> "3"   [ label = "0" ];
    "2" -> "2_3" [ label = "1" ];

    "3" -> "2"   [ label = "0" ];
    "3" -> "2_4" [ label = "1" ];

    "2_3" -> "2_3"   [ label = "0" ];
    "2_3" -> "2_3_4" [ label = "1" ];

    "2_4" -> "2_3"   [ label = "0" ];
    "2_4" -> "2_3_4" [ label = "1" ];

    "2_3_4" -> "2_3_4" [ label = "0" ];
    "2_3_4" -> "2_3_4" [ label = "1" ];
    "3_4" -> "2_4" [ label = "0" ];
    "3_4" -> "2_4" [ label = "1" ];
}

And here in rendered form:

Rendered dot graph

Note: The table lacks any information regarding state acceptance, hence so does the graph.

3
00ps, I forgot my example. Lemme punch it in real quick...Old McStopher

3 Answers

12
votes

When you construct a DFA from an NFA you basically find those sets of states that the NFA can be in a time (like simulating the NFA). First you begin with the start state and then find all states that can be reached through epsilon transitions. This set of states form the start state of the resulting DFA. Then you follow the outgoing transitions from this state set. Those may lead to another NFA state, for that you find the states reachable through epsilon inputs again, and you will get another set of NFA states that will be a new DFA state. You do this process until you have finished.

The point is, that the resulting DFA states will become sets of the old NFA states, that are compatible (regarding to epsilon transitions). These sets may also overlap, that is no error. And now I can answer your questions:

1) The first column represents the new DFA states. It shows the NFA state sets that form the given DFA state.

2) Your assumption is correct, it means loopback to state {2,3} on 0 input.

3) The DFA table can be easily constructed from this table, just name your states with letters. Any DFA states that contain at least one NFA accept state will become accept states in the DFA, too.

I hope I was clear enough.

4
votes

The core idea Is probably to understand that the DFA is a sort of machine that is superimposed over the NFA. While the NFA is simpler in terms of the number of nodes, or its relationship with the problem, it's rules are quite subtle and compled (it goes into the right state, whichever that might be). The dfa is much more complex in terms of the nodes it contains, but has really simple rules, since there is exactly one output state for any given input.

The transformation is pretty straitforward. each state in the DFA is a subset of the states in the NFA. The start state of the DFA is the set containing only the start state of the NFA, and the accept state for the DFA is all of its states which have the accept state of the NFA as elements.

The transitions in the DFA are the only tricky party. an NFA is non-deterministic because its output states for a given input is a set of states, but the DFA has sets of the corresponding NFA states as its own states, representing which of the NFA states the automaton could be in. So the output state of any DFA state for any given input is the union of output states of all of the NFA states of that DFA state.

In terms of actual implementations, the DFA has a state population that is essentially the powerset of the NFA's states. IE, 2^(n) for n NFA states. In practice it's usually much smaller, but there's no general way to predict it's size, so some practical NFA to DFA implementations generate the DFA states dynamically as they are reached and cache them. Once a certain number of states are created, the least recently used is discarded, limiting the size of the cache.

0
votes

Suppose the input NFA is (S, Q, q0, T, F) where S is the set of input symbols, Q is the set of states, q0 is the start state, T is a set of transitions, and F is the set of accepting states. Each transition is a triple (q, s, r) where q and r are states and s is a string of length 0 or 1.

For any finite string s, let D(s) be the set of all states that can be reached from q0 following a path of transitions whose labels catenate together to make s.

What the algorithm needs to do is to construct a deterministic automaton whose states are subsets of Q and such that for any string s, the DFA will end up in state D(s).

If that’s the case, then any DFA state that contains an accepting state of the NFA should be an accepting state of the DFA.

Consider the empty string, epsilon; D( “” ) will be the epsilon-closure of q0, i.e. all the states you can reach from q0, following only transitions labeled with the empty string. Let’s call that Set Q0. (In your example D(“”)={1}.) Now we “explore” that state, which means that: for each input symbol a, you figure out where the transition for that symbol, out of the state should go. That results in you discovering a bunch more states that need to be in the DFA. (In your example S={0,1}, so these states will be D(“0“)={1} and D(“1”)={2}. But D(“0”) is the same as D{“”); so it’s not new. So the only state now that has been discovered, but not explored is D(“1”).) And just keep exploring states until there are no more states left that have been discovered but haven’t been explored.