
I'm writing an ANTLR4 grammar for a filter query parser (similar to GitHub's issue queries). In theory, queries should look like:

tag:abc AND user:john

However, I'm having trouble parsing inputs like:

tag:tag

Here the first tag should mean that we're filtering by tags, and the second is just plain text.

My grammar looks like this:

parse
 : expression? EOF
 ;

expression
 : expression  operator  expression
 | WHITESPACE* selector WHITESPACE*
 ;

selector:
    SELECTOR ':' value=TEXT;

operator
 : AND | OR | WHITESPACE
 ;

SELECTOR     :  TAG | USER;
AND          : 'AND';
OR           : 'OR';
TAG          : 'tag' ;
USER         : 'user' ;
WHITESPACE   : (' ' | '\t') ;
TEXT_CHAR    : ~[ :];
TEXT         : TEXT_CHAR+;

2 Answers

0
votes

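Keep TAG and USER as separate tokens (remove the combined SELECTOR lexer rule, which otherwise always wins over them) and match them directly in the parser rule. This assumes a COLON : ':' lexer rule is defined: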

selector: key=TAG  COLON value=TEXT
        | key=USER COLON value=TEXT
        ;

Whenever the selector rule matches, the corresponding SelectorContext#key field will contain the TAG or USER token actually matched.
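
As a rough sketch of how the lexer side of this might look (not a tested drop-in): the COLON rule and the fragment modifier on TEXT_CHAR are assumptions added here, and the value is widened to accept TAG and USER because the lexer emits those tokens for the bare words tag and user wherever they appear, including after the colon.

```antlr
// Parser rule: key records which keyword matched.
// value also accepts TAG and USER, since 'tag' and 'user' are never lexed as TEXT.
selector : key=TAG  COLON value=(TEXT | TAG | USER)
         | key=USER COLON value=(TEXT | TAG | USER)
         ;

// Lexer rules: on equal-length matches the rule listed first wins,
// so the keywords must come before TEXT.
TAG   : 'tag' ;
USER  : 'user' ;
COLON : ':' ;
TEXT  : TEXT_CHAR+ ;

// A fragment is only a building block and never becomes a token itself.
// Without this, a one-character value would be lexed as TEXT_CHAR, not TEXT.
fragment TEXT_CHAR : ~[ \t:] ;
```

With these rules, tag:tag should lex as TAG COLON TAG, and tag:tags as TAG COLON TEXT, because the longer match wins. If AND or OR must also be usable as values, they would need to be added to the value set in the same way.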

0
votes

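Because SELECTOR is listed before TEXT, the lexer always tokenizes tag as SELECTOR, never as TEXT. You can allow SELECTOR as a value by replacing the selector rule with: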

selector:
    SELECTOR ':' value=(TEXT | SELECTOR);
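
A minimal sketch of the lexer rules this relies on, assuming the rest of the question's grammar stays unchanged. Turning TAG, USER and TEXT_CHAR into fragments and adding the key label are assumptions made here for illustration, not part of the original grammar.

```antlr
selector : key=SELECTOR ':' value=(TEXT | SELECTOR) ;

// SELECTOR and TEXT both match the input 'tag' with the same length,
// and ANTLR breaks the tie in favor of the rule defined first.
SELECTOR : TAG | USER ;
TEXT     : TEXT_CHAR+ ;

// Fragments only help build other lexer rules and never produce tokens.
fragment TAG       : 'tag' ;
fragment USER      : 'user' ;
fragment TEXT_CHAR : ~[ \t:] ;
```

Under these rules tag:tag should lex as SELECTOR ':' SELECTOR, and the text of either token is available through the key and value labels on the rule's context. The kind of filter is then read from the key token's text rather than from its token type.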