The grammar below does not parse my input the way I expect.
Here is the grammar:
program: (keyword | string | WS)*;
keyword: 'print';
string: QUOTE (CH | WS)*? QUOTE;
QUOTE: '\'';
WS : [ \t\r\n]+;
CH: .;
The goal is to have a language with both string literals and keywords.
The input being parsed is as follows:
print 'printed'
It should be parsed as a keyword, then whitespace, then a string literal.
Instead, it is parsed differently: the parser sees the keyword print inside the string literal. I believe this is because a parasitic lexer rule (an implicit token) has been created for the literal 'print' used in the keyword rule.
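To illustrate what I think is happening: assuming this is an ANTLR 4 combined grammar, where a literal used in a parser rule gets an implicit token that takes priority over the explicit lexer rules, my grammar seems to behave roughly as if it were written like the sketch below. The names Sketch and PRINT are made up for the illustration and do not appear in my grammar.

```antlr
// Sketch only: an explicit spelling of what the implicit token appears to do.
// PRINT is a hypothetical name; the original grammar uses the literal 'print'.
grammar Sketch;

program: (keyword | string | WS)*;
keyword: PRINT;
string: QUOTE (CH | WS)*? QUOTE;   // only CH and WS tokens are accepted between the quotes

PRINT: 'print';     // matched wherever the characters p-r-i-n-t appear, even after a quote
QUOTE: '\'';
WS : [ \t\r\n]+;
CH: .;              // any other single character
```

If that reading is right, the lexer would turn the input print 'printed' into PRINT, WS, QUOTE, PRINT, CH (e), CH (d), QUOTE. The string rule then cannot match, because a PRINT token sits between the two QUOTE tokens.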
How can I avoid or work around this?
I don't want to specify that a string literal can contain keywords, because that is logically incorrect.
I also can't use the DOT (.) wildcard, because I don't want to allow every token inside the quotes (a quote must not occur there).
So, what should I do?