I have given up porting the C# grammar from ANTLR 3.2 to ANTLR 4, so now I want to build a Java parser and visitor instead. The Java grammar for ANTLR 4 on GitHub (https://github.com/antlr/grammars-v4/blob/master/java/Java.g4) is meant to work with any target language, but it contains some code that is specific to the Java target and does not work with the C# target. I am talking about these lexer rules:
fragment
JavaLetter
    :   [a-zA-Z$_] // these are the "java letters" below 0xFF
    |   // covers all characters above 0xFF which are not a surrogate
        ~[\u0000-\u00FF\uD800-\uDBFF]
        // {Character.isJavaIdentifierStart(_input.LA(-1))}?
    |   // covers UTF-16 surrogate pairs encodings for U+10000 to U+10FFFF
        [\uD800-\uDBFF] [\uDC00-\uDFFF]
        //{Character.isJavaIdentifierStart(Character.toCodePoint((char)_input.LA(-2), (char) _input.LA (-1)))}?
    ;

fragment
JavaLetterOrDigit
    :   [a-zA-Z0-9$_] // these are the "java letters or digits" below 0xFF
    |   // covers all characters above 0xFF which are not a surrogate
        ~[\u0000-\u00FF\uD800-\uDBFF]
        // {Character.isJavaIdentifierPart(_input.LA(-1))}?
    |   // covers UTF-16 surrogate pairs encodings for U+10000 to U+10FFFF
        [\uD800-\uDBFF] [\uDC00-\uDFFF]
        //{char.isJavaIdentifierPart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))}?
    ;
I have commented out the target-specific code starting with {Character.isJavaIdentifier...}, and the grammar now works with the C# target. But I wonder why that code is there in the first place. I think it returns true if the character just consumed (or the two characters just consumed, in the LA(-2) case) is a valid Java identifier start or part, but what is this action code actually needed for? In C#, char has no static method such as IsJavaIdentifierPart or anything similar.
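To make the question concrete, here is a minimal Java sketch of what I understand the two variants of JavaLetter to accept. The class JavaLetterCheck and its method names are made up for illustration and are not part of the grammar or of ANTLR; only Character.isJavaIdentifierStart, Character.toCodePoint, Character.isHighSurrogate and Character.isLowSurrogate are real JDK calls. This is my reading of the rule, not a definitive description of what the generated lexer does.

```java
// Hypothetical helper that mimics the three alternatives of the JavaLetter
// fragment, once with the Java predicates enabled and once with them removed.
final class JavaLetterCheck {

    // First alternative: [a-zA-Z$_]
    private static boolean isAsciiJavaLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '$' || c == '_';
    }

    // JavaLetter as written in Java.g4, predicates active.
    static boolean withPredicate(String s) {
        char c = s.charAt(0);
        if (c <= 0xFF) {
            return isAsciiJavaLetter(c);
        }
        if (Character.isHighSurrogate(c)) {
            // Third alternative: a surrogate pair, checked as one code point.
            if (s.length() < 2 || !Character.isLowSurrogate(s.charAt(1))) {
                return false;
            }
            return Character.isJavaIdentifierStart(Character.toCodePoint(c, s.charAt(1)));
        }
        // Second alternative: any other char above 0xFF, filtered by the predicate.
        return Character.isJavaIdentifierStart(c);
    }

    // JavaLetter with the predicates commented out, as in my C# version.
    static boolean withoutPredicate(String s) {
        char c = s.charAt(0);
        if (c <= 0xFF) {
            return isAsciiJavaLetter(c);
        }
        if (Character.isHighSurrogate(c)) {
            return s.length() >= 2 && Character.isLowSurrogate(s.charAt(1));
        }
        return true; // every non-surrogate char above 0xFF is accepted
    }

    public static void main(String[] args) {
        String[] samples = { "x", "\u03C0", "\u2022", "\uD801\uDC00", "\uD83D\uDE00" };
        for (String s : samples) {
            System.out.printf("U+%04X with=%b without=%b%n",
                    s.codePointAt(0), withPredicate(s), withoutPredicate(s));
        }
    }
}
```

With the predicates, a bullet (U+2022) or an emoji (U+1F600) is rejected as an identifier start, while a Greek pi (U+03C0) or a Deseret letter (U+10400) is accepted. Without them, all four are accepted, so the difference seems to show up only for non-letter characters above 0xFF.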
My question is: if I remove this action code, will the parser fail on certain identifier names when parsing Java source code? If yes, what can I substitute for it in the C# target?
Thanks for replies! PK