6
votes

Are there any good CSS grammars out there for antlr4? I know there are some grammars for antlr3, but it turns out CSS is not trivial to parse without "lexer modes", which were added in v4. Why?

Consider the following CSS selectors:

.hello.world { /* ... */ }
.hello .world { /* ... */ }

In most grammars, whitespace is simply ignored. But if you ignore whitespace, it becomes impossible to distinguish between the two selectors above at the parser level.
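To make the ambiguity concrete, here is a minimal sketch in plain Python (not ANTLR). The `TOKEN` pattern and the `tokenize` helper are hypothetical and cover only class selectors and braces. With whitespace skipped, both selectors produce the same token stream:

```python
import re

# Hypothetical token pattern: class selectors, braces, and whitespace only.
TOKEN = re.compile(
    r"(?P<CLASS>\.[A-Za-z_][\w-]*)|(?P<LBRACE>\{)|(?P<RBRACE>\})|(?P<WS>\s+)"
)

def tokenize(text, skip_ws=True):
    """Return (kind, lexeme) pairs; optionally drop whitespace tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        # m.lastgroup is the name of the alternative that matched.
        if not (skip_ws and m.lastgroup == "WS"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

compound = ".hello.world { }"     # one element with both classes
descendant = ".hello .world { }"  # .world nested somewhere inside .hello

# With whitespace skipped, the parser sees identical input for both.
print(tokenize(compound) == tokenize(descendant))
# With whitespace kept, the streams differ, but WS now appears everywhere.
print(tokenize(compound, skip_ws=False) == tokenize(descendant, skip_ws=False))
```

The first comparison is equal and the second is not, which is exactly the trade-off described here: skipping whitespace loses the descendant combinator, while keeping it pushes WS tokens into every rule.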

Then again, if you don't ignore whitespace, the grammar becomes pretty noisy with WS? or WS* patterns everywhere, since whitespace is mostly meaningless unless it occurs within a selector.

This is where lexer modes from antlr4 come in: with modes, you can define a different set of lexer rules for each context you enter (e.g. don't ignore whitespace within the "selector" context).
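As a rough illustration of the idea (again plain Python, not ANTLR syntax), the sketch below switches rule sets on `{` and `}`. The rule tables and the `tokenize_modal` name are assumptions made up for this example, and only a tiny subset of CSS is covered:

```python
import re

# Hypothetical per-mode rules, loosely imitating ANTLR 4 lexer modes.
# Each rule is (token kind, compiled pattern, skip this token?).
SELECTOR_RULES = [
    ("CLASS", re.compile(r"\.[A-Za-z_][\w-]*"), False),
    ("LBRACE", re.compile(r"\{"), False),
    ("WS", re.compile(r"\s+"), False),  # significant: may be a descendant combinator
]
BLOCK_RULES = [
    ("IDENT", re.compile(r"[A-Za-z_][\w-]*"), False),
    ("COLON", re.compile(r":"), False),
    ("SEMI", re.compile(r";"), False),
    ("RBRACE", re.compile(r"\}"), False),
    ("WS", re.compile(r"\s+"), True),   # insignificant inside a block: dropped
]

def tokenize_modal(text):
    """Tokenize with whitespace kept in selectors and skipped in blocks."""
    rules = SELECTOR_RULES
    tokens, pos = [], 0
    while pos < len(text):
        for kind, pattern, skip in rules:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if not skip:
            tokens.append((kind, m.group()))
        pos = m.end()
        # Mode switches: '{' enters the block rules, '}' returns to selectors.
        if kind == "LBRACE":
            rules = BLOCK_RULES
        elif kind == "RBRACE":
            rules = SELECTOR_RULES
    return tokens

for css in (".hello.world { color: red; }", ".hello .world { color: red; }"):
    print([kind for kind, _ in tokenize_modal(css)])
```

Only the selector part carries WS tokens, so the two selectors stay distinguishable while the declaration block remains free of whitespace noise. A parser built on top of this would still need to tolerate the single WS token before `{`.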

That said, I'll accept any grammar for antlr3 as well, so long as it handles whitespace properly, as that's the version we're using now anyway ;-)

1
We ended up just using phloc-css, an excellent open source CSS parser library. Great support for the latest CSS specification, and very actively developed. - gzak
Maybe helpful: antlr2.org/article/whitespace (i.e. just tokenize it like everything else; charVocabulary has not been needed since around 3.3). - Darryl Miles

1 Answer

3
votes

I imagine you've already found an answer to this question, as it was posted so long ago. Even so:

A good Antlr v4 grammar for CSS3 can be found on the Antlr GitHub here:

Antlr v4 CSS3 Grammar

As stated in the README, it works on a number of CSS files; however, it doesn't handle the full syntax of @import or @include.