We all know by now that parsing HTML with regular expressions is not possible in general: HTML's arbitrarily deep nesting needs at least a context-free grammar, while regular expressions can only recognize regular languages. The same is certainly true of programming languages.
Recently, the Rainbow.js syntax highlighter was announced. Its premise is described as very simple:
Rainbow on its own is very simple. It goes through code blocks, processes regex patterns, and wraps matching patterns in tags.
I figured syntax highlighting is essentially a task of the same complexity as language parsing, if we assume it has to be both good and suitable for many languages. Still, while there is quite a bit of criticism of that library, neither that criticism nor the Hacker News discussion (taken as an example of a discussion among technically inclined people) mentions that highlighting syntax with regular expressions is basically impossible in the general case, which I'd consider a major, show-stopping flaw.
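To make the concern concrete, here is a minimal Python sketch of the approach as I understand it from the description: try a list of regex patterns against the code and wrap whatever matches in tags. This is not Rainbow's actual code; the rule set, the `highlight` function, and the CSS class names are all hypothetical and chosen only for illustration.

```python
import html
import re

# Hypothetical rules for a tiny C-like language (not Rainbow's real patterns).
# Order matters: earlier rules win when several could match at one position.
RULES = [
    ("comment", r"/\*.*?\*/"),            # block comment, ends at the FIRST "*/"
    ("string", r'"(?:\\.|[^"\\])*"'),     # double-quoted string with escapes
    ("keyword", r"\b(?:if|else|return|while)\b"),
    ("number", r"\b\d+\b"),
]

# Combine all rules into one alternation of named groups.
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES),
    re.DOTALL,
)


def highlight(code: str) -> str:
    """Wrap every regex match in a <span> named after the rule that matched."""
    out = []
    pos = 0
    for m in TOKEN_RE.finditer(code):
        # Text between matches is emitted as plain (escaped) text.
        out.append(html.escape(code[pos:m.start()], quote=False))
        token = html.escape(m.group(), quote=False)
        out.append(f'<span class="{m.lastgroup}">{token}</span>')
        pos = m.end()
    out.append(html.escape(code[pos:], quote=False))
    return "".join(out)


print(highlight("return 42;"))
print(highlight("/* a /* b */ c */"))
```

The first call handles the easy case fine. The second shows the kind of failure I have in mind: in a language that allows nested block comments, the pattern stops at the first `*/` and leaves the trailing ` c */` unhighlighted, because matching balanced delimiters is beyond what a regular expression can do.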
Now the question is: is there something I'm missing? In particular:
- Is syntax highlighting with regular expressions possible in general?
- Is this an instance of an applied 80/20 rule, where just enough is possible with regular expressions to be useful?