I am writing a parser for a programming language I'm building, and I'm running into an issue: ANTLR seems intent on not matching variable declarations.
Here is the grammar:
// Define a grammar called simc
grammar simc;
//Parser Rules
program : statement+ ;
statement : declaration | assignment | ( expression SEMICOLON ) ;
expression : LEFTPAREN expression RIGHTPAREN #parenthesisExp
//Math
| <assoc=right> expression '^' expression #powerExp
| expression (ASTERISK|SLASH) expression #mulDivExp
| expression (PLUS|MINUS) expression #addSubExp
//Bool Operations
| expression EQUALITY expression #equalCompExp
| expression NONEQUALITY expression #notequalCompExp
| expression GREATERTHAN expression #greaterCompExp
| expression LESSTHAN expression #lessCompExp
| expression GREATERTHANOREQUALTO expression #greaterorequalCompExp
| expression LESSTHANOREQUALTO expression #lessorequalCompExp
//Any value that isn't an expression itself
| value #valueExp
;
value : constvalue | functioncall | variable ;
functioncall : IDENTIFIER LEFTPAREN expression? ( COMMA expression )? RIGHTPAREN ;
declaration : typelabel variable EQUALS expression SEMICOLON ;
assignment : variable EQUALS expression SEMICOLON ;
constvalue : intvalue | floatvalue | stringvalue | boolvalue ;
typelabel : INTLABEL | FLOATLABEL | STRINGLABEL | BOOLLABEL ;
variable : IDENTIFIER ;
intvalue : INTVALUE ;
floatvalue : FLOATVALUE ;
stringvalue : STRINGVALUE ;
boolvalue : BOOLVALUE ;
//Lexer Rules
IDENTIFIER : [a-zA-Z][a-zA-Z0-9]* ;
LEFTPAREN : '(' ;
RIGHTPAREN : ')' ;
INTLABEL : I N T ;
FLOATLABEL : F L O A T ;
STRINGLABEL : S T R I N G ;
BOOLLABEL : B O O L ;
INTVALUE : [0-9]+ ;
FLOATVALUE : [0-9]+ ( PERIOD [0-9]+ F? | F ) ;
STRINGVALUE : QUOTE ( '\\"' | . )*? QUOTE ;
BOOLVALUE : ( T R U E ) | ( F A L S E ) ;
SEMICOLON : ';' ;
ASTERISK : '*' ;
SLASH : '/' ;
PLUS : '+' ;
MINUS : '-' ;
EQUALS : '=' ;
EQUALITY : '==' ;
NONEQUALITY : '!=' ;
GREATERTHAN : '>' ;
LESSTHAN : '<' ;
GREATERTHANOREQUALTO : '>=' ;
LESSTHANOREQUALTO : '<=' ;
COMMA : ',' ;
PERIOD : '.' ;
QUOTE : '"' ;
fragment A : [aA] ; // match either an 'a' or 'A'
fragment B : [bB] ;
fragment C : [cC] ;
fragment D : [dD] ;
fragment E : [eE] ;
fragment F : [fF] ;
fragment G : [gG] ;
fragment H : [hH] ;
fragment I : [iI] ;
fragment J : [jJ] ;
fragment K : [kK] ;
fragment L : [lL] ;
fragment M : [mM] ;
fragment N : [nN] ;
fragment O : [oO] ;
fragment P : [pP] ;
fragment Q : [qQ] ;
fragment R : [rR] ;
fragment S : [sS] ;
fragment T : [tT] ;
fragment U : [uU] ;
fragment V : [vV] ;
fragment W : [wW] ;
fragment X : [xX] ;
fragment Y : [yY] ;
fragment Z : [zZ] ;
WS : [ \r\t\n]+ -> skip ;
COMMENT : ( ( '/' '/' .*? ( '\r'|'\t'|'\n' ) ) | '/*' .*? '*/' ) -> skip ;
If I'm not mistaken, the grammar should match the code int a = 5; as a variable declaration. Instead, I get three things: an empty statement (I don't understand how that is even possible), a statement marked as incorrect that contains the text int (in my testing, this only worked for valid type names), and a correct assignment. As far as I understand, declarations should be tried before assignments, right? Why does it match like this, and how can I fix it?
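One way to narrow this down is to look at the token stream rather than the parse tree. The sketch below is not ANTLR itself. It is a small Python emulation of the tie-breaking the ANTLR 4 lexer is documented to use (the longest match wins, and on a tie the rule defined first wins), applied to a subset of the lexer rules above in the same order. The `RULES` table and the `tokenize` helper are hypothetical names introduced only for this illustration, and the case-insensitive fragments are approximated with Python's `(?i:...)` groups.

```python
import re

# Subset of the lexer rules from the grammar, kept in the same order.
# Assumption: fragments such as I N T are modelled as case-insensitive regexes.
RULES = [
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("INTLABEL", r"(?i:int)"),
    ("FLOATLABEL", r"(?i:float)"),
    ("STRINGLABEL", r"(?i:string)"),
    ("BOOLLABEL", r"(?i:bool)"),
    ("INTVALUE", r"[0-9]+"),
    ("SEMICOLON", r";"),
    ("EQUALS", r"="),
    ("WS", r"[ \r\t\n]+"),
]


def tokenize(text):
    """Emulate longest-match lexing where ties go to the earliest rule."""
    pos = 0
    tokens = []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = re.compile(pattern).match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on equal length the rule listed first is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WS":  # WS is skipped, as in the grammar
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("int a = 5;"))
```

Under these assumptions the emulation labels int as IDENTIFIER rather than INTLABEL, because both rules match the same three characters and IDENTIFIER is listed first. If the real lexer behaves the same way, the parser never sees a typelabel, which would be consistent with the declaration rule not matching. Moving the keyword rules above IDENTIFIER in `RULES` makes this emulation produce INTLABEL for int instead.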