The second alternative ((1-9)(0-9)) of the following parser rule results in two nodes in the abstract syntax tree.
oneToHundred
: ('1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9')
| ('1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9')('0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9')
| '100'
;
(Side note: lexing the numbers into digit tokens isn't an option for me, because a sub-range of 0-9 such as 2-4 can sometimes represent something very different from a digit, which I can't influence.)
So for 15 I get two nodes, '1' and '5', but I would like to get a single node representing the number fifteen.
I cannot do this in the lexer at the token level because, depending on the context, 15 can mean two very different things: either a "one-symbol and a five-symbol" (which definitely should be two nodes) or "fifteen". According to this post, context-sensitivity should be left to the parser.
(Edit for clarification:)
Example of the context-sensitivity:
The input consists of blocks separated by semicolons.
Input:
11;2102;34%;P11o
This would be split into four parts:
11 - not a number, but a '1'-symbol followed by another '1'-symbol
2102 - not a number, but: '2'-symbol '1'-symbol '0'-symbol '2'-symbol
34% - here 34 is the number thirty-four
P11o - 'P'-symbol '1'-symbol '1'-symbol 'o'-symbol
Of these four blocks, 34% is recognized as a percent-block by a parser rule and the others as symbol-blocks. So the AST should look something like this:
SYMBOL
    1
    1
SYMBOL
    2
    1
    0
    2
PERCENT
    34
SYMBOL
    P
    1
    1
    o
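To make the expected result concrete, here is a small hand-written C# sketch (plain string handling, not ANTLR) that prints the tree shape above for the example input, one block per line. The class `ExpectedTree` and the helper `Describe` are made up for illustration, and the sketch assumes that a block is a percent-block exactly when it ends in '%', which is a simplification of my real rules.

```csharp
using System;
using System.Linq;

static class ExpectedTree
{
    // Hypothetical helper: returns "KIND child child ..." for one block.
    // Assumption: a block is a percent-block exactly when it ends with '%'.
    public static string Describe(string block)
    {
        if (block.EndsWith("%"))
        {
            // The digits before '%' form ONE number node.
            return "PERCENT " + block.Substring(0, block.Length - 1);
        }

        // Otherwise every character becomes its own symbol node.
        return "SYMBOL " + string.Join(" ", block.Select(c => c.ToString()).ToArray());
    }

    public static void Main()
    {
        // Blocks are separated by semicolons.
        foreach (var block in "11;2102;34%;P11o".Split(';'))
        {
            Console.WriteLine(Describe(block));
        }
    }
}
```

The point is only that the same characters "34" collapse into one node inside a percent-block, while "11" in a symbol-block stays two nodes.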
The target is C#:
options {
language=CSharp3;
output=AST;
}
I'm an ANTLR noob, so: is there a good way to merge these two nodes in the parser, or am I better off adding an imaginary token and concatenating the two digits "manually" in C# after parsing?
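For reference, the "manual" fallback I have in mind would look roughly like the sketch below. It assumes the parser wraps the digits under an imaginary token that I call `NUMBER` here. `Node` is a hypothetical stand-in for a tree node, not the ANTLR runtime's tree type, and `DigitMerger` is likewise made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a tree node; NOT the ANTLR runtime's tree type.
class Node
{
    public string Text;
    public List<Node> Children = new List<Node>();

    public Node(string text, params Node[] children)
    {
        Text = text;
        Children.AddRange(children);
    }
}

static class DigitMerger
{
    // Walks the tree and replaces the digit children of every NUMBER node
    // (assumed imaginary token name) with one child holding the joined text.
    public static void MergeDigits(Node node)
    {
        foreach (var child in node.Children)
        {
            MergeDigits(child);
        }

        if (node.Text == "NUMBER" && node.Children.Count > 1)
        {
            string merged = string.Concat(node.Children.Select(c => c.Text).ToArray());
            node.Children = new List<Node> { new Node(merged) };
        }
    }

    public static void Main()
    {
        // PERCENT -> NUMBER -> '3' '4', as the parser might produce it.
        var tree = new Node("PERCENT", new Node("NUMBER", new Node("3"), new Node("4")));
        MergeDigits(tree);
        Console.WriteLine(tree.Children[0].Children[0].Text);
    }
}
```

This works, but it means a second pass over the tree after parsing, which is why I would prefer a solution inside the grammar if one exists.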