Edit:
Here is the updated tree and parser grammars:
parser grammar:
options {
language = CSharp2;
output=AST;
}
tokens {
UNARY_MINUS;
CALL;
}
program : (function)* main_function
;
function: 'function' IDENTIFIER '(' (parameter (',' parameter)*)? ')' 'returns' TYPE declaration* statement* 'end' 'function'
-> ^('function' IDENTIFIER parameter* TYPE declaration* statement*)
;
main_function
: 'function' 'main' '(' ')' 'returns' TYPE declaration* statement* 'end' 'function'
-> ^('function' 'main' TYPE declaration* statement*)
;
parameter
: 'param' IDENTIFIER ':' TYPE
-> ^('param' IDENTIFIER TYPE)
;
declaration
: 'variable' IDENTIFIER ( ',' IDENTIFIER)* ':' TYPE ';'
-> ^('variable' TYPE IDENTIFIER+ )
| 'array' array ':' TYPE ';'
-> ^('array' array TYPE)
;
statement
: ';'! | block | assignment | if_statement | switch_statement | while_do_statement | for_statement | call_statement | return_statement
;
call_statement
: call ';'!
;
return_statement
: 'return' expression ';'
-> ^('return' expression)
;
block : 'begin' declaration* statement* 'end'
-> ^('begin' declaration* statement*)
| '{' declaration* statement* '}'
-> ^('{' declaration* statement*)
;
assignment
: IDENTIFIER ':=' expression ';'
-> ^(':=' IDENTIFIER expression )
| array ':=' expression ';'
-> ^(':=' array expression)
;
array : IDENTIFIER '[' expression (',' expression)* ']'
-> ^(IDENTIFIER expression+)
;
if_statement
: 'if' '(' expression ')' 'then' statement ('else' statement)? 'end' 'if'
-> ^('if' expression statement statement?)
;
switch_statement
: 'switch' '(' expression ')' case_part+ ('default' ':' statement)? 'end' 'switch'
-> ^('switch' expression case_part+ statement?)
;
case_part
: 'case' literal (',' literal)* ':' statement
-> ^('case' literal+ statement)
;
literal
: INTEGER | FLOAT | BOOLEAN | STRING
;
while_do_statement
: 'while' '(' expression ')' 'do' statement 'end' ' while'
-> ^('while' expression statement)
;
for_statement
: 'for' '(' IDENTIFIER ':=' expression 'to' expression ')' 'do' statement 'end' 'for'
-> ^('for' IDENTIFIER expression expression statement)
;
expression
: conjuction ( 'or'^ conjuction)*
;
conjuction
: equality ('and'^ equality)*
;
equality: relation (('=' | '/=')^ relation)?
;
relation: addition (('<' | '<=' | '>' | '>=')^ addition)?
;
addition: multiplication (('+' | '-')^ multiplication)*
;
multiplication
: unary_operation (('*' | '/' | '%')^ unary_operation)*
;
unary_operation
: '-' primary
-> ^(UNARY_MINUS primary)
| 'not' primary
-> ^('not' primary)
| primary
;
primary : IDENTIFIER
| array
| literal
| '('! expression ')'!
| '(' TYPE ')' '(' expression ')'
-> ^(TYPE expression)
| call
;
call : IDENTIFIER '(' arguments ')'
-> ^(CALL IDENTIFIER arguments)
;
arguments
: (expression (','! expression)*)?
;
BOOLEAN : 'true' | 'false'
;
T YPE : 'integer' | 'boolean' | 'float' | 'string' | 'array' | 'void'
;
IDENTIFIER : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
INTEGER : '0'..'9'+
;
FLOAT
: ('0'..'9')+ '.' ('0'..'9')+
;
COMMENT
: '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
| '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
STRING
: '"' .* '"'
;
And here is the updated tree grammar (I altered expressions, and so on...):
options {
language = 'CSharp2';
//tokenVocab= token vocab needed
ASTLabelType=CommonTree; // what is Java type of nodes?
}
program : (function)* main_function
;
function: ^('function' IDENTIFIER parameter* TYPE declaration* statement*)
;
main_function
: ^('function' 'main' TYPE declaration* statement*)
;
parameter
: ^('param' IDENTIFIER TYPE)
;
declaration
: ^('variable' TYPE IDENTIFIER+)
| ^('array' array TYPE )
;
statement
: block | assignment | if_statement | switch_statement | while_do_statement | for_statement | call_statement | return_statement
;
call_statement
: call
;
return_statement
: ^('return' expression)
;
block : ^('begin' declaration* statement*)
| ^('{' declaration* statement*)
;
assignment
: ^(':=' IDENTIFIER expression )
| ^(':=' array expression)
;
array : ^(IDENTIFIER expression+)
;
if_statement
: ^('if' expression statement statement?)
;
switch_statement
: ^('switch' expression case_part+ statement?)
;
case_part
: ^('case' literal+ statement)
;
literal
: INTEGER | FLOAT | BOOLEAN | STRING
;
while_do_statement
: ^('while' expression statement)
;
for_statement
: ^('for' IDENTIFIER expression expression statement)
;
expression
: ^('or' expression expression)
| ^('and' expression expression)
| ^('=' expression expression)
| ^('/=' expression expression)
| ^('<' expression expression)
| ^('<=' expression expression)
| ^('>' expression expression)
| ^('>=' expression expression)
| ^('+' expression expression)
| ^('-' expression expression)
| ^(UNARY_MINUS expression)
| ^('not' expression)
| IDENTIFIER
| array
| literal
| ^(TYPE expression)
| call
;
call : ^(CALL IDENTIFIER arguments)
;
arguments
: (expression (expression)*)?
;
I succesfluly generated tree graph with DOTTreeGenerator and StringTemplate classes so it seems that all is working at the moment. But any suggestions (about bad habits or something else in this grammars) are appreciated since I don't have a lot of experience with ANTLR or language recognition.
See updates on http://vladimir-radojicic.blogspot.com