I have an ANTLR v4 grammar in a .NET application. An object can be either an array or a String. An array is a list of zero or more objects enclosed in square brackets. A String is a sequence of characters enclosed in parentheses. A String may contain unescaped balanced parentheses, but it should not contain any unbalanced left or right parentheses; they can be included using the escape sequence \( or \). As \ would be used to introduce the escape sequence, it would then also need to be escaped as \\.
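To make the intended String rules concrete, here is a minimal Python sketch of a recognizer for them. It is not part of my application and not a translation of the ANTLR grammar; the name match_string and the constant ESCAPABLE are made up for illustration. It assumes the only valid escapes are a backslash followed by a left parenthesis, a right parenthesis, or another backslash, and that a lone backslash is an error.

```python
from typing import Optional

# Characters that may follow a backslash (assumption taken from the description above).
ESCAPABLE = "()\\"


def match_string(text: str, start: int = 0) -> Optional[int]:
    """Return the index just past the String that begins at text[start], or None if invalid."""
    if start >= len(text) or text[start] != "(":
        return None
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == ")":
            # Closing parenthesis of this String.
            return i + 1
        if ch == "(":
            # Unescaped left parenthesis: must begin a balanced, nested String.
            end = match_string(text, i)
            if end is None:
                return None
            i = end
        elif ch == "\\":
            # A backslash is only valid as the start of an escape sequence.
            if i + 1 < len(text) and text[i + 1] in ESCAPABLE:
                i += 2
            else:
                return None
        else:
            # Any other character is an ordinary String character.
            i += 1
    # Ran out of input before the closing parenthesis.
    return None
```

Under these assumptions, match_string(r"(\()") accepts the whole four-character input, while match_string("(a(b c)") is rejected because the outer parenthesis is never closed.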
I have tried to code the grammar in such a way that balanced parentheses are simply recursive Strings within Strings, with a base case that disallows parentheses except in an escape sequence.
grammar Sample ;

root
    : 'BT' object+ 'ET' EOF
    ;

object
    : array
    | String
    ;

array
    : '[' object* ']'
    ;

String
    : '(' ( StringCharacter | String )* ')'
    ;

fragment StringCharacter
    : EscapeSequence
    | ~[()\\]
    ;

fragment EscapeSequence
    : '\\('
    | '\\)'
    | '\\'
    ;

Whitespace : [ \t\r\n] -> skip ;
The grammar above works for some inputs:
BT [] ET
BT () ET
BT (\)) ET
BT () () ET
BT (one) (two) ET
BT [(one) (two)] ET
BT (one) [(two)] ET
BT (\() [(two)] ET
BT () [(\))] ET
BT (\)) (\)) ET
but it fails for this one:
BT (\() [(\))] ET
In this case, I am trying to encode a String with a single escaped left parenthesis then an array with a single element that's a String with a single escaped right parenthesis.
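To spell out the token sequence I expect for this input, here is a small Python sketch of the intended tokenization. It is illustrative only: tokenize and scan_string are hypothetical helper names, the scanner uses a simple depth counter rather than anything ANTLR does, and it assumes the same three escape sequences described earlier.

```python
from typing import List


def scan_string(text: str, start: int) -> int:
    """Return the index just past the String starting at text[start] (a '(')."""
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # Only \( \) and \\ are accepted as escapes (assumption from the description).
            if i + 1 < len(text) and text[i + 1] in "()\\":
                i += 2
                continue
            raise ValueError(f"invalid escape at index {i}")
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unterminated String")


def tokenize(text: str) -> List[str]:
    """Split the input into BT, ET, brackets, and whole String tokens, skipping whitespace."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in " \t\r\n":
            i += 1
        elif ch in "[]":
            tokens.append(ch)
            i += 1
        elif ch == "(":
            end = scan_string(text, i)
            tokens.append(text[i:end])
            i = end
        elif text.startswith(("BT", "ET"), i):
            tokens.append(text[i:i + 2])
            i += 2
        else:
            raise ValueError(f"unexpected character {ch!r} at index {i}")
    return tokens
```

For the failing input, tokenize(r"BT (\() [(\))] ET") yields six tokens: BT, the String (\(), the opening bracket, the String (\)), the closing bracket, and ET. That is the sequence I want the ANTLR lexer to produce.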
The error message states:
line: 1:13 extraneous input ']' expecting {'ET', '[', String}
How should I change the grammar to achieve my goal?