Here is a working subset of my C parsing grammar. It can only parse the input shown below, but it is enough to illustrate the problem I hit with my full grammar. Note that it follows the traditional method of defining operator precedence, with one rule per precedence level:
grammar CPPProcessor;

translation_unit: expression;

primary_expression:
    '1'
    //| {false}? '(' expression ')'
    | 'a'
    | 'b'
    ;

postfix_expression:
    primary_expression
    | postfix_expression '(' expression ')'
    ;

unary_expression:
    postfix_expression
    | '-' cast_expression
    ;

cast_expression:
    unary_expression
    | '(' 'a' ')' cast_expression
    ;

additive_expression:
    cast_expression
    | additive_expression '-' cast_expression
    ;

expression : additive_expression;

WS: [ \t\f]+ -> channel(1);
CRLF: '\r'? '\n' -> channel(1);
The start rule is translation_unit, and the input is a single line containing this:
(a)-b
Notice that the alternative guarded by the semantic predicate in primary_expression has been commented out.
(The way to read the grammar: if the second alternative of primary_expression were enabled without a predicate, the input would be parsed as a subtraction, (a) minus b. With that alternative absent, the input is parsed as a C-style cast of -b to type a.)
Problem: I assumed that an alternative guarded by {false}? is equivalent to having no alternative at all, so uncommenting it should make no difference. However, the parse failed when I removed the comment, i.e. when the rule reads:
primary_expression:
    '1'
    | {false}? '(' expression ')'
    | 'a'
    | 'b'
    ;
With that change I got this error:
line 1:0 no viable alternative at input '('
Why can a {false}? semantic predicate cause a parse failure? Could it be a bug in ANTLR4? The second alternative of postfix_expression, which is left-recursive, appears to be what triggers the issue: when the left recursion is removed, the issue disappears.
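For reference, a minimal sketch of what removing the left recursion from postfix_expression could look like. This assumes the usual EBNF rewrite of a left-recursive rule into a loop; it is an illustration, not necessarily the exact form used in the full grammar:

```antlr
// Assumed rewrite: the left-recursive alternative
//     postfix_expression '(' expression ')'
// is replaced by zero or more call suffixes after the primary expression.
postfix_expression:
    primary_expression ( '(' expression ')' )*
    ;
```

This form accepts the same inputs as the left-recursive rule, but the resulting parse tree is flat (one postfix_expression node with repeated suffixes) instead of nested.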