I have the following ANTLR4 grammar to interpret regular expressions.
// Regular Expression Grammar.
grammar RegExpr;
program : expr EOF          # Root
        ;
expr    : TERM              # TermNode
        | expr '?'          # OptionalNode
        | '(' expr ')'      # OrdinaryNode
        | expr expr         # ConcatNode
        | expr '|' expr     # OrNode
        ;
ESC : '\\' . ;
TERM : ([a-zA-Z0-9,.*^+\-&'":><#![\]] | ESC)+ ;
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines
However, when I try to parse the string literal '\\(' in Java (that is, the input \(), I get:
line 1:0 no viable alternative at input '\('
I want to treat any character prefixed with '\\' as part of a terminal. For example, '\\(', '\\)', '\\\\', and '\\X' should all be treated as terminals.
In the end, I want to parse '\(a.(b|c)\)' as
'\(a.' (b|c) '\)'
which represents '\(a.b\)' and '\(a.c\)'. Then I can remove all '\'s to get '(a.b)' and '(a.c)'.
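To make that last step concrete, here is a minimal sketch of the un-escaping I have in mind. It is independent of ANTLR, and the class and method names (`Unescape`, `unescape`) are placeholders I made up for illustration, not part of my actual code. It assumes every backslash escapes exactly the one character that follows it.

```java
final class Unescape {
    private Unescape() {}

    // Drops the backslash of each escape pair and keeps the escaped character.
    // Assumption: a backslash always escapes exactly the next character;
    // a lone trailing backslash is kept as-is.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                i++;              // skip the backslash itself
                c = s.charAt(i);  // keep the escaped character verbatim
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Java literal "\\(a.b\\)" is the 7-character input \(a.b\)
        System.out.println(unescape("\\(a.b\\)")); // prints (a.b)
        System.out.println(unescape("\\(a.c\\)")); // prints (a.c)
    }
}
```

With this, an escaped backslash ('\\\\' as a Java literal) comes out as a single '\', which is the behaviour I want.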
Can anyone please point out why the above grammar gives errors on '\\(' and '\(a.(b|c)\)'?
Thanks!