I am new to ANTLR and I am trying to get this grammar working:

grammar TemplateGrammar;

//Parser Rules 

start
    : block
    | statement
    | expression
    | parExpression
    | primary
    ;

block
    : LBRACE statement* RBRACE
    ;

statement
    : block
    | IF parExpression statement (ELSE statement)?
    | expression
    ;

parExpression
    : LPAREN expression RPAREN
    ;

expression
    : primary #PRIMARY
    | number op=('*'|'/') number            #MULDIV
    | number op=('+'|'-') number            #ADDSUB
    | number op=('>='|'<='|'>'|'<') number  #GRLWOREQUALS
    | expression op=('='|'!=') expression   #EQDIFF
    ;

primary
    :   parExpression
    |   literal
    ;

literal
    :   number  #NumberLiteral
    |   string  #StringLiteral
    |   columnName #ColumnNameLiteral
    ;

number
    :   DecimalIntegerLiteral       #DecimalIntegerLiteral
    |   DecimalFloatingPointLiteral #FloatLiteral
    ;

string
    :   '"' StringChars? '"'
    ;

columnName
    :   '[' StringChars? ']'
    ;

// Lexer Rules

//Integers
 DecimalIntegerLiteral
    :   DecimalNumeral
    ;

 fragment
 DecimalNumeral
    :   '0'
    |   NonZeroDigit (Digits? | Underscores Digits)
    ;

 fragment
 Digits
    :   Digit (DigitOrUnderscore* Digit)?
    ;

 fragment
 Digit
    :   '0'
    |   NonZeroDigit
    ;

 fragment
 NonZeroDigit
    :   [1-9]
    ;

 fragment
 DigitOrUnderscore
    :   Digit
    |   '_'
    ;

 fragment
 Underscores
    :   '_'+
    ;

//Floating point
DecimalFloatingPointLiteral
    :   Digits '.' Digits? ExponentPart?
    |   '.' Digits ExponentPart?
    |   Digits ExponentPart
    |   Digits
    ;

fragment
ExponentPart
    :   ExponentIndicator SignedInteger
    ;

fragment
ExponentIndicator
    :   [eE]
    ;

fragment
SignedInteger
    :   Sign? Digits
    ;

fragment
Sign
    :   [+-]
    ;

//Strings

StringChars
    :   StringChar+
    ;

fragment
StringChar
    :   ~["\\]
    |   EscapeSequence
    ;

fragment
EscapeSequence
    :   '\\' [btnfr"'\\]
    ;

//Separators
LPAREN          : '(';
RPAREN          : ')';
LBRACE          : '{';
RBRACE          : '}';
LBRACK          : '[';
RBRACK          : ']';
COMMA           : ',';
DOT             : '.';

//Keywords
IF              : 'IF';
ELSE            : 'ELSE';
THEN            : 'THEN';

//Operators
PLUS            : '+';
MINUS           : '-';
MULTIPLY        : '*';
DIVIDE          : '/';
EQUALS          : '=';
DIFFERENT       : '!=';
GRTHAN          : '>';
GROREQUALS      : '>=';
LWTHAN          : '<';
LWOREQUALS      : '<=';
AND             : '&';
OR              : '|';

WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ -> skip ;


When I enter "Test" as the input, it works and returns the string "Test".

Here is what I get in the IParseTree when I enter "Test":

"(start (statement (expression (primary (literal (string \" Test \"))))))"


But when I enter [Test] (which is almost the same as "Test", but with brackets instead of quotes), the parser does not recognize the token.

Here is the IParseTree I get when I enter [Test]:

"(start [Test])"


The same happens with numbers: it recognizes standalone numbers such as 1, 123, 12.5, etc., but not expressions like 1+2.

Do you have any idea why the parser doesn't recognize the columnName rule but works fine with the string rule?

2 Answers

1
votes

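Probably because "StringChar" is defined incorrectly for your purpose? It doesn't exclude "]" (or "["), so the lexer, which always takes the longest match, turns all of [Test] into a single StringChars token and the parser never sees separate '[' and ']' tokens. The same thing happens to 1+2: StringChars matches all three characters, which beats the one-character number token.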

Perhaps you want to define StringChar as:

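// Excludes both brackets: if '[' were still allowed, StringChars would
// swallow the opening bracket together with the text that follows it.
fragment
StringChar
:   ~["\\[\]]
|   EscapeSequence
;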

If it were my grammar, I'd define a QuotedStringChar as you have for quoted strings, and define BracketStringChar as ~[\]\\] to use for your bracket column names.
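
To make the longest-match effect concrete, here is a minimal Python sketch that imitates it. This is not ANTLR itself: the rule names LBRACK, RBRACK and QUOTE and the helpers make_rules and tokenize are hypothetical stand-ins, the regular expressions only approximate the grammar's character sets, and escape sequences are left out.

```python
import re

# Hypothetical, simplified stand-ins for the lexer rules in the question.
# Rule order only matters for ties: the earlier rule wins on equal length.
def make_rules(string_char_class):
    return [
        ("LBRACK", re.compile(r"\[")),
        ("RBRACK", re.compile(r"\]")),
        ("QUOTE", re.compile(r'"')),
        ("StringChars", re.compile(string_char_class + "+")),
    ]

def tokenize(text, rules):
    """At each position, try every rule and keep the longest match."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

ORIGINAL = make_rules(r'[^"\\]')        # like ~["\\]: brackets are allowed
EXCLUDING = make_rules(r'[^"\\\[\]]')   # brackets are excluded as well

print(tokenize("[Test]", ORIGINAL))     # one StringChars token
print(tokenize("[Test]", EXCLUDING))    # bracket, text, bracket
print(tokenize('"Test"', ORIGINAL))     # quotes already split correctly
```

With the original character class the whole of [Test] comes back as one token, which is why the parser has nothing that columnName can match. Once the brackets are excluded, the input splits into three tokens, just as the quoted string already did.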

Welcome to debugging grammars at the lexical level, and to defining different types of "quotes" for different types of strings. This is pretty common. (You should see Ruby, where you can define the string quote at the beginning of the string, ick.)

0
votes

I finally got it working by adding:

QuotedStringChars
    :   '"' ~[\"]+ '"'
    ;

BracketStringChars
    :   '[' ~[\]]+ ']'
    ;

These rules match any characters between quotes or brackets. Then:

primary
    :   literal #PrimLiteral
    |   number  #PrimNumber
    ;

literal
    :   QuotedStringChars   #OneString
    |   BracketStringChars  #ColumnName
    |   number              #NUMBER
    ;

number
    :   DecimalIntegerLiteral       #DecimalIntegerLiteral
    |   DecimalFloatingPointLiteral #FloatLiteral
    ;

The literal rule helps to distinguish quoted strings, bracketed strings, and numbers.

There is a duplication of number in primary and literal rules because I need a different behavior in my application for each one.
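
For anyone who wants to see the effect without running ANTLR, here is a rough Python sketch of how the inputs from the question are split once the delimiters are part of the token. It rests on assumptions: the old StringChars rule is taken to be removed, the regular expressions only approximate the grammar rules (no underscores or exponents in numbers), and OPERATOR, TOKEN_RULES and tokenize are hypothetical names used for illustration.

```python
import re

# Assumed, simplified token set: names mirror the rules above, regexes are approximations.
TOKEN_RULES = [
    ("QuotedStringChars", r'"[^"]+"'),
    ("BracketStringChars", r"\[[^\]]+\]"),
    ("DecimalIntegerLiteral", r"0|[1-9]\d*"),
    ("DecimalFloatingPointLiteral", r"\d+\.\d*|\.\d+"),
    ("OPERATOR", r">=|<=|!=|[-+*/=<>]"),   # hypothetical catch-all for the operator tokens
    ("WHITESPACE", r"\s+"),
]
COMPILED = [(name, re.compile(rx)) for name, rx in TOKEN_RULES]

def tokenize(text):
    """Longest match wins; on a tie the rule listed first wins; whitespace is skipped."""
    pos, tokens = 0, []
    while pos < len(text):
        matches = [(name, m.group()) for name, rx in COMPILED if (m := rx.match(text, pos))]
        if not matches:
            raise ValueError(f"no rule matches at position {pos}")
        # max() returns the first maximal element, so earlier rules win ties
        name, lexeme = max(matches, key=lambda t: len(t[1]))
        if name != "WHITESPACE":
            tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens

for sample in ['"Test"', "[Test]", "1+2", "12.5 >= [Price]"]:
    print(sample, "->", tokenize(sample))
```

Because no rule can swallow arbitrary text any more, 1+2 splits into number, operator, number, and [Test] arrives at the parser as a single BracketStringChars token.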

I managed this thanks to the good advice of Ira Baxter :)

Hope this will help other newbies to ANTLR like me to have a better understanding :)