terminal/datatype/parser rules in xtext

Question

I'm using xtext 2.4. What I want to do is a SQL-like syntax. The things confuse me are I'm not sure which things should be treated as terminal/datatype/parser rules. So far my grammar related to MyTerm is:

Model:
    (terms += MyTerm ';')*
;

MyTerm:
    constant=MyConstant | variable?='?'| collection_literal=CollectionLiteral 
;

MyConstant
    : string=STRING 
    | number=MyNumber
    | date=MYDATE 
    | uuid=UUID 
    | boolean=MYBOOLEAN
    | hex=BLOB
;

MyNumber:
    int=SIGNINT | float=SIGNFLOAT
;


SIGNINT returns ecore::EInt:
    '-'? INT
;


SIGNFLOAT returns ecore::EFloat:
    '-'? INT '.' INT;
;

CollectionLiteral:
    => MapLiteral | SetLiteral | ListLiteral
;

MapLiteral:
    '{' {MapLiteral} (entries+=MapEntry (',' entries+=MapEntry)* )? '}'
;

MapEntry:
    key=MyTerm ':' value=MyTerm
;

SetLiteral:
    '{' {SetLiteral} (values+=MyTerm (',' values+=MyTerm)* )+ '}'
;

ListLiteral:
    '[' {ListLiteral} ( values+=MyTerm (',' values+=MyTerm)* )? ']'
;

terminal MYDATE:
  '0'..'9' '0'..'9' '0'..'9' '0'..'9' '-'
  '0'..'9' '0'..'9' '-'
  '0'..'9' '0'..'9'
;

terminal HEX:
    'a'..'h'|'A'..'H'|'0'..'9'
;   

terminal UUID:
    HEX HEX HEX HEX HEX HEX HEX HEX '-'
    HEX HEX HEX HEX '-'
    HEX HEX HEX HEX '-'
    HEX HEX HEX HEX '-'
    HEX HEX HEX HEX HEX HEX HEX HEX HEX HEX HEX HEX
;

terminal BLOB:
    '0' ('x'|'X') HEX+
;

terminal MYBOOLEAN returns ecore::EBoolean:
    'true' | 'false' | 'TRUE' | 'FALSE'
;

Few questions:

How to define integer with sign? If I define another terminal rule terminal SIGNINT: '-'? '0'..'9'+;, antlr will complain about INT becoming unreachable. Therefore I define it as a datatype rule SIGNINT: '-'? INT; Is this the correct way to do it?
How to define float with sign? I did exactly the same as define integer with sign, SIGNFLOAT: '-'? INT '.' INT;, not sure if this is correct as well.
How to define a date rule? I want to use a parser rule to store year/month/day info in fields, but define it as MyDate: year=INT '-' month=INT '-' date=INT; antlr will complain Decision can match input such as "RULE_INT '-' RULE_INT '-' RULE_INT" using multiple alternatives: 2, 3 As a result, alternative(s) 3 were disabled for that input
I also have some other rules like

the following

RelationCompare:
    name=ID compare=COMPARE term=MyTerm
;

but a=4 won't be a valid RelationCompare because a and 4 will be treat as HEXs. I found this because if I change the relation to j=44 then it works. In this post it said terminal rule defined eariler will shadow those defined later. However, if I redefine terminal ID in my grammar, whether put it in front or after of terminal HEX, antlr will conplain The following token definitions can never be matched because prior tokens match the same input: RULE_HEX,RULE_MYBOOLEAN. This problem happens in k=0x00b as well. k=0xaab is valid but k=0x00b is not.

Any suggestion?

Sam Harwell Sam Harwell · Accepted Answer · 2013-08-07T00:20:32

How do you define an integer with sign?

Treat it as two separate tokens '-' and INT, and use a parser rule instead of a lexer rule.

How do you define a float with sign?

Treat it as two separate tokens '-' and FLOAT, and use a parser rule instead of a lexer rule.

How do you define a date rule?

Treat it as five separate tokens and use a parser rule instead of a lexer rule.

I don't know the answer to the last question since this is in Xtext as opposed to just ANTLR.

terminal/datatype/parser rules in xtext

2 Answers