ANTLR4: Lexer returning a single token when in a lexer mode

Question

I am attempting to use a lexer mode with ANTLR4 with the following lexer grammar:

STRING: '"' -> pushMode(STRING_MODE);
mode STRING_MODE;
STRING_CONTENTS: ~('"'|'\n'|'\r')+ -> type(STRING);
END_STRING: '"' -> type(STRING), popMode;
STRING_UNMATCHED: . -> type(UNMATCHED);

Is there a way to return a single token of type STRING that covers all the characters matched inside the mode, including the opening quote that triggered entry into the mode?
Also, when does a mode end?

I am aware that I can also write the string token like so:

STRING: '"' (~["\n\r]|'\\"')* '"';

GRosenberg · Accepted Answer · 2018-09-12T23:02:33

1) The more command tells the lexer to keep the text matched so far and continue matching, so the accumulated text ends up in the first token emitted by a rule that does not use more.

For example, given:

STRING: '"' -> more, pushMode(STRING_MODE);

mode STRING_MODE;
    STRING_CONTENTS: ~('"'|'\n'|'\r')+ -> more ;
    END_STRING: '"' -> type(STRING), popMode;

the text matching the STRING and STRING_CONTENTS rules is prepended to that of the END_STRING rule, resulting in a STRING-typed token containing the full text of the string.
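
To make the accumulation concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not ANTLR's actual implementation: the RULES table, the tokenize function, and the WS rule are hypothetical names introduced only for this illustration, and the sketch tries rules in order rather than picking the longest match as a generated ANTLR lexer would.

```python
import re

# Toy model of a mode stack plus the `more` command (not ANTLR's real code).
# Each rule is (name, regex, commands); the commands mirror the grammar above.
RULES = {
    "DEFAULT_MODE": [
        ("STRING", re.compile(r'"'), {"more": True, "push": "STRING_MODE"}),
        # Hypothetical rule, added only so the demo can skip spaces.
        ("WS", re.compile(r"\s+"), {"skip": True}),
    ],
    "STRING_MODE": [
        ("STRING_CONTENTS", re.compile(r'[^"\n\r]+'), {"more": True}),
        ("END_STRING", re.compile(r'"'), {"type": "STRING", "pop": True}),
    ],
}


def tokenize(text):
    modes = ["DEFAULT_MODE"]  # mode stack; the last entry is the active mode
    tokens = []
    pos = 0
    start = 0  # offset where the token currently being accumulated began
    while pos < len(text):
        # Simplification: first matching rule wins (no longest-match logic).
        for name, pattern, cmds in RULES[modes[-1]]:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")
        pos = m.end()
        if "push" in cmds:
            modes.append(cmds["push"])
        if cmds.get("pop"):
            modes.pop()
        if cmds.get("more"):
            continue  # emit nothing; `start` stays put so the text keeps growing
        if not cmds.get("skip"):
            # The emitted token covers everything since `start`, including
            # the text matched by the earlier `more` rules.
            tokens.append((cmds.get("type", name), text[start:pos]))
        start = pos
    return tokens


print(tokenize('"abc" "d e"'))
# prints [('STRING', '"abc"'), ('STRING', '"d e"')]
```

Each quoted string comes out as one STRING token that includes both quotes, because only END_STRING emits a token and the start offset is not reset while the more rules are matching.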

2) The end of a mode is implied by the first subsequent occurrence of any of the following: a parser rule, another mode statement, a fragment rule, or the end of the file (EOF).
