0
votes

I've got a grammar in ANTLR and don't understand how it can be recursive. Is there any way to get ANTLR to show the derivation that it used to see that my rules are recursive?

The recursive grammar in it's entirety:

grammar DeadMG;

options {
    language = C;
}

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

INT :   '0'..'9'+
    ;

FLOAT
    :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    |   '.' ('0'..'9')+ EXPONENT?
    |   ('0'..'9')+ EXPONENT
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

CHAR:  '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
    ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   UNICODE_ESC
    |   OCTAL_ESC
    ;

fragment
OCTAL_ESC
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;

program 
    : namespace_scope_definitions;

namespace_scope_definitions
    : (namespace_definition | type_definition | function_definition | variable_definition)+;

type_scope_definitions
    : (type_definition | function_definition | variable_definition)*;   

namespace_definition
    : 'namespace' ID '{' namespace_scope_definitions '}';

type_definition 
    : 'type' ID? (':' expression (',' expression)+ )? '{' type_scope_definitions '}';

function_definition
    : ID '(' function_argument_list ')' ('(' function_argument_list ')')? ('->' expression)? compound_statement;

function_argument_list
    : expression? ID (':=' expression)? (',' function_argument_list)?;

variable_definition
    : 'static'? expression? ID ':=' expression 
    | 'static'? expression ID ('(' (expression)* ')')?;

literal_expression
    : CHAR
    | FLOAT
    | INT
    | STRING
    | 'auto'
    | 'type'
    | type_definition;

primary_expression
    : literal_expression
    | ID
    | '(' expression ')';

expression 
    : assignment_expression;

assignment_expression
    : logical_or_expression (('=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<='| '>>=' | '&=' | '^=' | '|=') assignment_expression)*;

logical_or_expression
    : logical_and_expression ('||' logical_and_expression)*;

logical_and_expression
    : inclusive_or_expression ('&&' inclusive_or_expression)*;

inclusive_or_expression
    : exclusive_or_expression ('|' exclusive_or_expression)*;

exclusive_or_expression
    : and_expression ('^' and_expression)*;

and_expression
    : equality_expression ('&' equality_expression)*;

equality_expression
    : relational_expression (('=='|'!=') relational_expression)*;

relational_expression
    : shift_expression (('<'|'>'|'<='|'>=') shift_expression)*;

shift_expression
    : additive_expression (('<<'|'>>') additive_expression)*;

additive_expression
    : multiplicative_expression (('+' multiplicative_expression) | ('-' multiplicative_expression))*;

multiplicative_expression
    : unary_expression (('*' | '/' | '%') unary_expression)*;

unary_expression
    : '++' primary_expression
    | '--' primary_expression
    | ('&' | '*' | '+' | '-' | '~' | '!') primary_expression
    | 'sizeof' primary_expression
    | postfix_expression;

postfix_expression
    : primary_expression
    | '[' expression ']'
    | '(' expression* ')'
    | '.' ID
    | '->' ID
    | '++' 
    | '--';

initializer_statement
    : expression ';'
    | variable_definition ';';

return_statement
    : 'return' expression ';';

try_statement
    : 'try' compound_statement catch_statement;

catch_statement
    : 'catch' '(' ID ')' compound_statement catch_statement?
    | 'catch' '(' '...' ')' compound_statement;

for_statement
    : 'for' '(' initializer_statement expression? ';' expression? ')' compound_statement;

while_statement
    : 'while' '(' initializer_statement ')' compound_statement;

do_while_statement
    : 'do' compound_statement 'while' '(' expression ')';

switch_statement
    : 'switch' '(' expression ')' '{' case_statement '}';

case_statement
    : 'case:' (statement)* case_statement?
    | 'default:' (statement)*;

if_statement
    : 'if' '(' initializer_statement ')' compound_statement;

statement
    : compound_statement
    | return_statement
    | try_statement
    | initializer_statement
    | for_statement
    | while_statement
    | do_while_statement
    | switch_statement
    | if_statement;

compound_statement
    : '{' (statement)* '}';

More specifically, I am having trouble with the following rules:

namespace_scope_definitions
    : (namespace_definition | type_definition | function_definition | variable_definition)+;

type_scope_definitions
    : (type_definition | function_definition | variable_definition)*;   

ANTLR is saying that alternatives 2 and 4, that is, type_definition and variable_definition, are recursive. Here's variable_definition:

variable_definition
: 'static'? expression? ID ':=' expression 
| 'static'? expression ID ('(' (expression)* ')')?;

and here's type_definition:

type_definition 
: 'type' ID? (':' expression (',' expression)+ )? '{' type_scope_definitions '}';

'type' itself, and type_definition, is a valid expression in my expression syntax. However, removing it is not resolving the ambiguity, so it doesn't originate there. And I have plenty of other ambiguities I need to resolve- detailing all the warnings and errors would be quite too much, so I'd really like to see more details on how they are recursive from ANTLR itself.

2
ANTLRWorks may be able to help you find it. - user597225
@Adam12: I'm using ANTLRWorks and I'm still pretty damn sure that my rules are not recursive. - Puppy
@Bart Kiers: OK. That doesn't really change the question, though, which is that I'd like to see how ANTLR figures that my rules are recursive, because I don't see it. - Puppy
@Bart Kiers: That's what I did. I had everything working just fine until I had expressions. Now everything and it's dog is ambiguous :( ANTLR is telling me some quite insane things, like an integral literal or identifier is a valid variable definition, which it quite clearly cannot be. - Puppy
I'd advise against enabling backtracking. - user597225

2 Answers

1
votes

My suggestion is to remove most of the operator precedence rules for now:

expression 
    : multiplicative_expression
      (
        ('+' multiplicative_expression)
      | ('-' multiplicative_expression)
      )*;

and then inline the rules that have a single caller to isolate the ambiguities. Yes it is tedious.

-1
votes

I found a few ambiguities in the grammar, fixed them and got a lot less warnings. However, I think that probably, LL is just not the right parsing algorithm for me. I am writing a custom parser and lexer. It would still have been nice if ANTLR would show me how it found the problems though, so that I might intervene and fix them.