OK, I've tried everything before coming here to ask, but this is driving me crazy.
I'm creating a simple language for querying documents in a custom NoSQL database. A sample query looks like this:
VALUE("price: " SUM($price) " Average: " AVG($price)).MATCH($price > 5 OR $price < 100 OR $cost > 30)
It sits somewhere between SQL and MongoDB's aggregation queries: the argument to VALUE concatenates the strings and the aggregation results, and MATCH takes a boolean filter expression.
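To make the intended semantics concrete, here is a rough Python sketch of what the sample query should compute. The document list, the `matches` and `run_query` helpers, and the way numbers are turned into text are all made up for illustration; this is not how the database implements it.

```python
# Hypothetical in-memory documents; the field names mirror the sample query.
docs = [
    {"price": 10, "cost": 4},
    {"price": 200, "cost": 50},
    {"price": 3, "cost": 1},
]


def matches(doc):
    # MATCH($price > 5 OR $price < 100 OR $cost > 30)
    return doc["price"] > 5 or doc["price"] < 100 or doc["cost"] > 30


def run_query(documents):
    # Only documents that pass the MATCH filter feed the aggregations.
    prices = [d["price"] for d in documents if matches(d)]
    total = sum(prices)
    average = total / len(prices) if prices else 0
    # VALUE("price: " SUM($price) " Average: " AVG($price))
    # Assumption: the pieces are concatenated as-is, with no extra separator.
    return "price: " + str(total) + " Average: " + str(average)


result = run_query(docs)
print(result)
```

The spaces between the arguments of VALUE are only separators in the query text; in this sketch they do not end up in the output.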
The problem is that when I parse this, I get a line 1:69 extraneous input ' ' expecting COMPARATOR, followed by a line 1:69 no viable alternative at input ' '. The same errors appear at columns 80-82 and 95-97.
As you can see, the problem is around the comparators ('<', '>', etc.). I've been looking through my grammar for conflicts or ambiguities without any luck (admittedly, I only got into ANTLR very recently).
Here's my grammar:
// Define a grammar called Capsa
grammar Capsa;
eval : VARIABLE | function;
function : functionValue;
functionValue : 'VALUE(' (STRING ' ')* functionNumber (' '(STRING|functionNumber))* ')' (match)?;
match : '.MATCH(' booleanexpression ')';
functionNumber: FUNCTIONNUMBERTYPE'(' value ')';
FUNCTIONNUMBERTYPE: 'SUM'|'AVG'|'MAX'|'MIN'|'FIRST'|'LAST' ;
value
: VARIABLE #Var
| REALNUMBER #Literal
| STRING #Literal
| calcexpression #Calc
| booleanValue #Literal;
/*
** Boolean stuff
*/
AND : '&&' | ' AND ';
OR : '||' | ' OR ';
NOT : '!' | ' NOT ';
booleanexpression : '(' booleanexpression ')' #BooleanParentExpression
| booleanexpression AND booleanexpression #AndExpression
| booleanexpression OR booleanexpression #OrExpression
| NOT booleanexpression #NotExpression
| (value COMPARATOR value) #Comparison
| booleanValue #ComparisonLogic;
booleanValue
: 'true'
| 'false';
/*
** Comparators
*/
fragment GT : '>';
fragment GTE : '>=';
fragment LT : '<';
fragment LTE : '<=';
fragment EQ : '=';
fragment EX : ':' | '==';
COMPARATOR : GT | GTE | LT | LTE | EQ | EX;
/*
** End Comparators
*/
/*
** End Boolean stuff
*/
/*
** Calc
*/
calcexpression
: '(' calcexpression ')' #CalcParentExpression
| calcexpression ('*'|'/') calcexpression #MultOrDiv
| calcexpression ('+'|'-') calcexpression #AddOrSub
| VARIABLE #CalcID
| REALNUMBER #CalcNumber;
/*
** End Calc
*/
fragment ID : [a-zA-Z_][a-zA-Z0-9_]+ ;
VARIABLE : '$'ID;
STRING : '"' (ESC | ~["\\])* '"' ;
fragment CONSTANT : STRING | REALNUMBER;
fragment ESC : '\\' (["\\/bfnrt] | UNICODE) ;
fragment UNICODE : 'u' HEX HEX HEX HEX ;
fragment HEX : [0-9a-fA-F] ;
fragment INT : [0-9]+ ; // one or more digits (leading zeros are allowed)
fragment EXP : [Ee] [+\-]? INT ; // \- since - means "range" inside [...]
REALNUMBER
: '-'? INT '.' INT EXP? // 1.35, 1.35E-9, 0.3, -4.5
| '-'? INT EXP // 1e10 -3e4
| '-'? INT // -3, 45
;
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines
The only solution I've found so far is to change the line that says:
| (value COMPARATOR value) #Comparison
to:
| (value ' '* COMPARATOR ' '* value) #Comparison
But that looks more like a hack than a solution to me...
What am I missing? I'm pretty sure it will be something quite dumb... but I've spent the whole day on this without luck...
Bonus track:
(This one is less important.) I'm also trying to allow calc expressions in the boolean queries (like 5+3 > 6 or $variable+10 < 100), but in that case parsing breaks completely: the parser expects a comparator ('>', '<', ...) at the point where the arithmetic operator ('+', '-', ...) appears.
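To be explicit about what I expect from these two examples, here is a small Python sketch of the tree shape I'm after: the comparison at the root, with the arithmetic as one of its operands. The tuple encoding, the `evaluate` helper, and the sample documents are hypothetical, only there to show the intended grouping.

```python
import operator

# Operators used in the two examples, plus the rest of the calc operators.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    ">": operator.gt,
    "<": operator.lt,
}


def evaluate(node, doc):
    """Evaluate a (op, left, right) tuple tree against one document."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, doc), evaluate(right, doc))
    if isinstance(node, str) and node.startswith("$"):
        # A $name leaf looks the field up in the document.
        return doc[node[1:]]
    return node  # numeric literal


# 5+3 > 6 should group as (5 + 3) > 6
tree_literal = (">", ("+", 5, 3), 6)

# $variable+10 < 100 should group as ($variable + 10) < 100
tree_variable = ("<", ("+", "$variable", 10), 100)

print(evaluate(tree_literal, {}))
print(evaluate(tree_variable, {"variable": 50}))
```

In other words, the calc expression should bind tighter than the comparator, so the '+' is consumed as part of the left-hand value before the comparator is looked for.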