1
votes

From Mastering Regular Expressions 3e:

As a result, broadly speaking, there are three types of regex engines:

  • DFA (POSIX or not—similar either way)
  • Traditional NFA (most common: Perl, .NET, PHP, Java, Python, . . . )
  • POSIX NFA

From the Theory of Computation: Formal Languages, Automata, and Complexity:

For each NFA, there is a DFA that accepts exactly the same language.

Can I argue that NFA and DFA are the same thing? Or even though their ability to recognize patterns are equivalent, but they are still different in some way?

1
The fact that you can construct a DFA which accepts the same language does not mean they are "the same". They operate differently, that's how they are different.tripleee

1 Answers

1
votes

There are two things you're missing:

  1. The "traditional NFA" implementations actually include abilities beyond the strict computer science definition of an NFA.

  2. Performance characteristics are a thing to care about, even given two implementations that cover the same answer.

The net effect is that the backtracking implementations (I like that name better than "traditional NFA") are slightly more expressive than DFA implementations because they can match regexes like (\w{3,})\1, which matches three or more word characters repeated twice (something that can't be matched by a DFA). At the same time, DFA implementations are guaranteed O(n) in input length, but it is very easy to write regexes that have O(n^2) or worse behavior when presented with a string they don't match. (See https://swtch.com/~rsc/regexp/regexp1.html )