
I have a lexer rule that defines a single-quoted string literal as:

L_S_STRING  : '\'' (('\'' '\'') | ('\\' '\'') | ~('\''))* '\'';

It fails in one particular case:


The problem seems to be with the last two single quotes. If I add a space between them, it works. Ending the string with two single quotes also works, e.g.


I am not sure whether this has something to do with a non-greedy operator causing ('\'' '\'') to be matched first. If so, I don't see how the last version could have worked.

In any event, could someone help please?

UPDATE - I am not able to reproduce it outside of the full grammar. This may be a red herring.

UPDATE - I missed some important context, so I posted another question: "Antlr4: single quote rule fails when there are escape chars plus carriage return, new line"

Can you please tell us more about your syntax? How are the characters escaped, what do two single quotes mean, and which strings are valid and which are not? - trollingchar
Please add an MCVE that demonstrates what you describe: stackoverflow.com/help/mcve - Bart Kiers

1 Answer


I can't reproduce that. Given the following grammar:

lexer grammar Test;

L_S_STRING  : '\'' (('\'' '\'') | ('\\' '\'') | ~('\''))* '\'';
OTHER       : . ;

which can be tested as follows:

String source = "A'yyyy-MM-dd\\\\'T\\\\'HH:mm:ss\\\\'Z\\\\''B";

Test lexer = new Test(CharStreams.fromString(source));
CommonTokenStream tokens = new CommonTokenStream(lexer);
tokens.fill(); // buffer every token up to EOF; getTokens() only returns what has been buffered

for (Token t : tokens.getTokens()) {
    System.out.printf("%-15s %s\n", Test.VOCABULARY.getSymbolicName(t.getType()), t.getText());
}

will print:

OTHER           A
L_S_STRING      'yyyy-MM-dd\\'T\\'HH:mm:ss\\'Z\\''
OTHER           B
EOF             <EOF>
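
If you want to sanity-check the rule without ANTLR on the classpath, here is a rough sketch that approximates L_S_STRING with java.util.regex. The class name QuoteRuleSketch and the helper firstString are made up for illustration, and a backtracking regex is not the same thing as ANTLR's longest-match lexing, so treat it as an approximation that happens to agree with the lexer on this input rather than a replacement for testing the real grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteRuleSketch {
    // Regex approximation (an assumption, not ANTLR itself) of:
    //   '\'' (('\'' '\'') | ('\\' '\'') | ~('\''))* '\''
    // Alternatives in order: doubled quote, backslash-quote, any char except a quote.
    private static final Pattern L_S_STRING = Pattern.compile("'(?:''|\\\\'|[^'])*'");

    // Returns the first substring matching the approximated rule, or null if none is found.
    static String firstString(String input) {
        Matcher m = L_S_STRING.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Same input as in the ANTLR test above.
        String source = "A'yyyy-MM-dd\\\\'T\\\\'HH:mm:ss\\\\'Z\\\\''B";
        System.out.println(firstString(source));
    }
}
```

In the final `\\''` of the input, the first backslash is consumed as an ordinary character, the second backslash plus the following quote is consumed as an escaped quote, and the last quote closes the string. That is why the last two single quotes are not taken as a doubled quote here: doing so would leave the literal without a closing quote.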