3
votes

I fundamentally don't understand how ANTLR works. Using the following grammar:

blockcomment    :   '/*' ANYCHARS '*/';

ANYCHARS        :   ('a'..'z' | '\n' | 'r' | ' ' | '0'..'9')*  ;

I get a warning message when I compile the grammar file that says:

"non-fragment lexer rule 'ANYCHARS' can match the empty string"

Fine. I want it to be able to match empty strings, since "/**/" is perfectly valid. But when I run "/**/" through the TestRig, I get:

missing ANYCHARS at '*/'

Obviously I could just change it so that '/**/' is handled as a special case:

blockcomment    :   '/*' ANYCHARS '*/' | '/**/';

But that doesn't really address the underlying issue. Can someone please explain to me what I am doing wrong? How can ANTLR raise a warning about matching empty strings and then not match them at the same time?


2 Answers

1
votes

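Add "fragment" to ANYCHARS? It will then do what you want, provided blockcomment becomes a lexer rule as well, since fragments can only be referenced from other lexer rules.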

1
votes
"non-fragment lexer rule 'ANYCHARS' can match the empty string"

The warning hints that you should make ANYCHARS a fragment. An empty string cannot be matched as a token, because that would produce infinitely many empty tokens anywhere in the source.

You want to make ANYCHARS part of the BLOCKCOMMENT token rather than a separate token. That is basically what fragments are good for: they simplify the lexer rules but don't produce tokens themselves.

BLOCKCOMMENT : '/*' ANYCHARS '*/';
// '\r' is the carriage return; a bare 'r' is already covered by 'a'..'z'
fragment ANYCHARS : ('a'..'z' | '\n' | '\r' | ' ' | '0'..'9')* ;

EDIT: Switched the parser rule blockcomment to the lexer rule BLOCKCOMMENT, since fragments can only be used from lexer rules.
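
For anyone who wants to try this in the TestRig, here is a minimal sketch of a complete ANTLR 4 grammar built around the lexer rule and fragment discussed here. The grammar name Comments, the start rule file, and the WS rule are my own assumptions for illustration and are not part of the question. I have also assumed that the 'r' in the original character set was meant to be '\r'.

```antlr
// Comments.g4 (hypothetical file and grammar name, chosen for illustration)
grammar Comments;

// Hypothetical start rule: a file is any number of block comments.
file : BLOCKCOMMENT* EOF ;

// The whole comment is a single token. For an empty comment, ANYCHARS
// contributes zero characters, but the token itself is never empty
// because the opening and closing delimiters are always consumed.
BLOCKCOMMENT : '/*' ANYCHARS '*/' ;

// A fragment never becomes a token on its own, so it may match the
// empty string without triggering the warning.
// Assumption: '\r' was intended where the question has 'r'.
fragment ANYCHARS : ('a'..'z' | '\n' | '\r' | ' ' | '0'..'9')* ;

// Assumption: whitespace between comments should be discarded.
WS : [ \t\r\n]+ -> skip ;
```

Running the TestRig with grammar name Comments, start rule file, and the -tokens option on an empty comment should then report a single BLOCKCOMMENT token followed by EOF. Note that this character set still rejects comments containing uppercase letters or punctuation; a common ANTLR 4 idiom for accepting arbitrary comment bodies is the non-greedy form '/*' .*? '*/', which may or may not fit your use case.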