I am very new to ANTLR and am trying to understand how lexer and parser rules work. I'm having trouble with a grammar I've written: multi-character lexer tokens seem to be treated as matches even when only their first few characters match the input. To demonstrate this, I have written a simple ANTLR 3 grammar:
grammar test;
options {
k=3;
}
@lexer::header { package test;}
@header {package test;}
sentence : (CHARACTER)*;
CHARACTER : 'a'..'z'|' ';
SPECIAL : 'special';
I'm using AntlrWorks to parse the following test input:
apple basic say sponsor speeds speckled specific wonder
The output I get is:
apple basic say nsor ds led ic wonder
It seems to me that the lexer is using k=1 and is therefore committing to my SPECIAL token for any input that includes the two letters 'sp'. Once it encounters 'sp', it keeps matching successive characters of the SPECIAL literal until the input fails to match the expected character, at which point it reports an error (consuming that character) and then continues with the rest of the sentence. Each error is of the form:
line 1:18 mismatched character 'o' expecting 'e'
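To make the behaviour I think I'm seeing concrete, here is a minimal Python sketch of it. This is only my own model of the symptoms, not ANTLR's generated lexer code: the function name `simulate`, the two-character commit on 'sp', and the way a complete keyword is passed through are all assumptions I made for illustration.

```python
KEYWORD = "special"


def simulate(text, keyword=KEYWORD, commit_len=2):
    """Model of the observed behaviour (an assumption, not ANTLR's real algorithm).

    Once the first `commit_len` characters of `keyword` are seen, the lexer
    commits to the keyword, reports the first mismatching character, drops it
    together with the characters matched so far, and carries on.
    """
    out = []
    errors = []
    i = 0
    while i < len(text):
        if text.startswith(keyword[:commit_len], i):
            # Committed to the keyword: count how many characters really match.
            j = 0
            while j < len(keyword) and i + j < len(text) and text[i + j] == keyword[j]:
                j += 1
            if j == len(keyword):
                # Full keyword: passed through unchanged here (a simplification).
                out.append(keyword)
                i += j
            elif i + j < len(text):
                errors.append(
                    f"line 1:{i + j} mismatched character "
                    f"{text[i + j]!r} expecting {keyword[j]!r}"
                )
                i += j + 1  # matched prefix and the offending character are lost
            else:
                i += j  # input ended in the middle of the keyword
        else:
            out.append(text[i])  # an ordinary single-character token
            i += 1
    return "".join(out), errors


if __name__ == "__main__":
    result, errs = simulate("apple basic say sponsor speeds speckled specific wonder")
    print(result)
    for message in errs:
        print(message)
```

Running this model on my test input gives the same text that AntlrWorks shows and one error per word starting with 'sp', which is why I believe the lexer is deciding on the first two characters only.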
However, this isn't the behaviour I want. I wish to create a lexer token that matches the keyword 'special', for use in other parser rules not included in this test example, without affecting other rules or input that just happen to start with the same characters.
To summarize:
- How do I actually set ANTLR 3 options (such as k=2 or k=3)? It seems to me, at least, that the options I'm trying to use here aren't being applied.
- Is there a better way to create parser or lexer rules to match a particular keyword in my input, without affecting processing of other parts of the input that don't contain a full match?