I'm working with Antlr4 to parse a boolean-like DSL.
Here is my grammar:
grammar filter;
filter: overall EOF;
overall
: LPAREN overall RPAREN
| category
;
category
: expression # InferenceCategory
| category AND category # CategoryAndBlock
| label COLON expression # CategoryBlock
| LPAREN category RPAREN # NestedCategory
;
expression
: NOT expression # NotExpr
| expression AND expression # AndExpr
| expression OR expression # OrExpr
| atom # AtomExpr
| LPAREN expression RPAREN # NestedExpression
;
label
: ALPHANUM
;
atom
: ALPHANUM
;
Here is an example input string to parse:
(cat1:(1 OR 2) AND cat2:( 4 ))
This grammar works fine with this input; it produces the following parse tree which perfectly suits my needs:
However, there is weird case of the DSL, where the "cat1" label is implicit when no other category is specified. This is what the InferenceCategory tag catches, where this expression will be handled as a category in my code later.
For example, with
((1 OR 2) AND cat2:( 4 ))
I get (as expected):
However, in the following instance:
cat2:( 4 ) AND (1 OR 2)
I get:
Notice that the second block is not identified as a InferenceCategory and but instead as a normal expression, under the first category. This is because there the grammar parses ( 4 ) following cat2: as a normal expression, and everything past that is parsed as a normal expression.
Is there any way to fix this? I've tried:
label COLON expression (AND category)* # CategoryBlock
(which doesn't work)
and
category AND category AND category
(which "works", but is extremely hacky and only works in the specific case that I have exactly three categories. Any more, and it breaks again.)