A keyword from my grammar can also appear in the input script in a position where the user can type essentially anything (e.g. a variable name), and ANTLR fails to parse the script when that happens.

I know most languages reserve a set of keywords that cannot be used as identifiers because they would get in the way of parsing.

But I thought my grammar rules were clear enough that ANTLR wouldn't get confused.

Here's a simplified version of the grammar:

grammar test;

script : statements EOF ;

statements : statement* ;

statement : (output_statement | variable_statement) ;

output_statement : identifier ('format' column_format) ;

column_format : STRING_LITERAL;

variable_statement : identifier '=' STRING_LITERAL ;

identifier : IDENTIFIER ;

IDENTIFIER : [a-z]+ ;

STRING_LITERAL : '"' ( ~[\\\r\n"] )* '"' ;

WS : [ \t\r\n\u000C]+ -> channel(HIDDEN) ;

The following parses ok:

x = "a"
x format "str"

But this next input text does not parse:

format = "a"
format format "str"

test::script:1:0: mismatched input 'format' expecting EOF

Is there any way to structure my grammar so "format" is permitted as an identifier?

Thanks.

1 Answer

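The literal 'format' inside a parser rule creates an implicit token that takes precedence over IDENTIFIER, so the lexer never produces an IDENTIFIER for the text "format". Since format is both a keyword and an identifier, give it an explicit token, declared before IDENTIFIER, and accept that token in the identifier rule as well: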

output_statement : identifier (FORMAT column_format) ;
// ...
identifier : IDENTIFIER | FORMAT ;   // 'format' is also accepted as an identifier
// ...
FORMAT     : 'format' ;              // must be declared before IDENTIFIER
IDENTIFIER : [a-z]+ ;
// ...
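
For reference, here is a sketch of what the complete grammar from the question might look like with these changes applied. It assumes ANTLR 4 and fills the elided parts with the rules from the question unchanged. Only FORMAT and the extra alternative in identifier are new, and the comments are mine.

```antlr
grammar test;

script : statements EOF ;

statements : statement* ;

statement : output_statement | variable_statement ;

// FORMAT is now an explicit token instead of the inline literal 'format'.
output_statement : identifier FORMAT column_format ;

column_format : STRING_LITERAL ;

variable_statement : identifier '=' STRING_LITERAL ;

// Accept the keyword token wherever an identifier is expected.
identifier : IDENTIFIER | FORMAT ;

// FORMAT and IDENTIFIER both match the text "format" with the same length;
// the rule declared first wins, so FORMAT has to come before IDENTIFIER.
FORMAT     : 'format' ;
IDENTIFIER : [a-z]+ ;

STRING_LITERAL : '"' ( ~[\\\r\n"] )* '"' ;

WS : [ \t\r\n\u000C]+ -> channel(HIDDEN) ;
```

With this version, the failing input from the question should tokenize as FORMAT '=' STRING_LITERAL FORMAT FORMAT STRING_LITERAL. The parser can tell the two statement kinds apart from the token that follows the identifier ('=' or FORMAT). The same pattern extends to further keywords: add each one as its own token above IDENTIFIER and as another alternative in identifier.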