When a rule matches in ANTLR4 and you get the text of that rule, any whitespace is usually missing, because the lexer commonly discards it with a rule like
WS: [ \n\t\r]+ -> skip;
Is it possible to ask in a parse tree visitor "Did this rule skip over any whitespace?"
E.g.
WS: [ \n\t\r]+ -> skip;
ALPHA: [a-z];
NUMERIC: [0-9];
myrule: (ALPHA | NUMERIC)+;
Then in the visitor (I'm using C++):
antlrcpp::Any MyVisitor::visitMyrule(dlParser::MyruleContext *ctx) {
    if (ctx->didSkipSomeWhitespace()) {
        /* There was whitespace */
    } else {
        /* There was no whitespace */
    }
    return false;
}
So:
f56fhj => no whitespace
o9f g66ff o => whitespace
I've tried getting the start/stop indices of the token so that I can compare the text length against the number of input characters that went into it. However, the stop token is not always available, and when it is, the values don't line up with the indices that I expect. It also does not appear to be simple to access the original input characters that formed the token.
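To make the index comparison concrete, here is a minimal, self-contained sketch of what I am trying to compute. It does not use the ANTLR runtime: `Tok`, `tokenize` and `hasSkippedGap` are hypothetical names I made up for illustration, and the toy lexer only imitates the grammar above. The assumption is that each token carries inclusive start/stop character offsets, similar to what `getStartIndex()` and `getStopIndex()` report on an ANTLR token.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a lexer token: inclusive character offsets
// into the original input.
struct Tok {
    std::size_t start;
    std::size_t stop;  // inclusive
};

// Toy lexer imitating the grammar above: each [a-z] or [0-9] character
// becomes one token; everything else (e.g. whitespace) yields no token.
std::vector<Tok> tokenize(const std::string &input) {
    std::vector<Tok> toks;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char c = input[i];
        const bool isAlpha = (c >= 'a' && c <= 'z');
        const bool isNumeric = (c >= '0' && c <= '9');
        if (isAlpha || isNumeric) {
            toks.push_back({i, i});
        }
    }
    return toks;
}

// True if two neighbouring tokens are not adjacent in the input,
// i.e. some characters between them were skipped.
bool hasSkippedGap(const std::vector<Tok> &toks) {
    for (std::size_t i = 1; i < toks.size(); ++i) {
        if (toks[i].start != toks[i - 1].stop + 1) {
            return true;
        }
    }
    return false;
}
```

With this sketch, `hasSkippedGap(tokenize("f56fhj"))` is false and `hasSkippedGap(tokenize("o9f g66ff o"))` is true. What I'm after is the equivalent check inside the visitor, presumably over the tokens between the rule's start and stop tokens, if those can be obtained reliably.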