I'm looking for a grammar for analyzing two type of sentences, that means words separated by white spaces:
- ID1: sentences with words not beginning with numbers
- ID2: sentences with words not beginning with numbers and numbers
Basically, the structure of the grammar should look like
ID1 separator ID2
ID1: Word can contain number like Var1234 but not start with a number
ID2: Same as above but 1234 is allowed
separator: e. g. '='
@Bart
I just tried to add two tokens '_'
and '"'
as lexer-rule Special
for later use in lexer-rule Word
.
Even I haven't used Special
in the following grammar, I get the following error in ANTLRWorks 1.4.2:
The following token definitions can never be matched because prior tokens match the same input: Special
But when I add fragment
before Special
, I don't get that error. Why?
grammar Sentence1b1;
tokens
{
TCUnderscore = '_' ;
TCQuote = '"' ;
}
assignment
: id1 '=' id2
;
id1
: Word+
;
id2
: ( Word | Int )+
;
Int
: Digit+
;
// A word must start with a letter
Word
: ( 'a'..'z' | 'A'..'Z') ('a'..'z' | 'A'..'Z' | Digit )*
;
Special
: ( TCUnderscore | TCQuote )
;
Space
: ( ' ' | '\t' | '\r' | '\n' ) { $channel = HIDDEN; }
;
fragment Digit
: '0'..'9'
;
Lexer-rule Special
shall then be used in lexer-rule Word
:
Word
: ( 'a'..'z' | 'A'..'Z' | Special ) ('a'..'z' | 'A'..'Z' | Special | Digit )*
;