I want to create a BASIC-like (high-level) language using ANTLR4.

The test input below creates a variable called $C and assigns it an integer value. The assignment works. The print statement also works, except when I concatenate the variable onto the string:

     ************ TEST CASE ****************

$C=15;

print "dangerdanger!"; # print works

print "Number of GB left=" + $C;

Parse Tree Inspector

Using the Parse Tree Inspector I can see that the assignment is parsed correctly, but when the parser reaches the variable concatenated onto the string it reports: mismatched input '+' expecting STMTEND.

I wondered if anyone could help me out here and see what adjustment I need to make to my rules and grammar to solve this issue.

Many thanks in advance.

Kevin

PS. As a side issue, I would rather have C$ than $C, but it's early days...

********RULES************


VARNAME : '$'('A'..'Z')* 
        ;

CONCAT  : '+'
        ;
STMTEND : SEMICOLON NEWLINE* | NEWLINE+
        ;
STRING  : SQUOTED_STRING (CONCAT SQUOTED_STRING | CONCAT VARNAME)*
    | DQUOTED_STRING (CONCAT DQUOTED_STRING | CONCAT VARNAME)*
        ; 
fragment SQUOTED_STRING : '\'' (~['])* '\''
    ;

fragment DQUOTED_STRING  
    :  '"' ( ESC_SEQ| ~('\\'|'"') )* '"'  
    ;  

fragment ESC_SEQ  
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')  
    |   UNICODE_ESC  
    |   OCTAL_ESC  
    ;  

fragment OCTAL_ESC  
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')  
    |   '\\' ('0'..'7') ('0'..'7')  
    |   '\\' ('0'..'7')  
    ;  

fragment HEX_DIGIT : '0x' ('0'..'9' | 'a'..'f' | 'A'..'F')+
    ;

fragment UNICODE_ESC :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT  
    ;  

SEMICOLON : ';' 
    ;

NEWLINE : '\r'?'\n'
    ;


************GRAMMAR************

print_command
    :   PRINT STRING STMTEND #printCommandLabel
    ;

assignment
    : VARNAME EQUALS INTEGER STMTEND #assignInteger 
    | VARNAME EQUALS STRING STMTEND #assignString
    ;
1 Answer

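You shouldn't try to create concat-expressions inside your lexer: that is the responsibility of the parser. As written, your STRING rule only matches a concatenation when the '+' follows the closing quote directly, with no whitespace in between. In your test input there are spaces around the '+', so STRING stops at the closing quote, the '+' becomes a separate CONCAT token, and the parser (which expects STMTEND after STRING) reports the mismatch. Something like this should do it: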

print_command
 :   PRINT expression STMTEND #printCommandLabel
 ;

assignment
 : VARNAME EQUALS expression STMTEND
 ;

expression
 : expression CONCAT expression
 | INTEGER
 | STRING
 | VARNAME
 ;

CONCAT
 : '+'
 ;

VARNAME 
 : '$'('A'..'Z')* 
 ;

STMTEND 
 : SEMICOLON NEWLINE* 
 | NEWLINE+
 ;

STRING
 : SQUOTED_STRING
 | DQUOTED_STRING
 ; 

fragment SQUOTED_STRING
 : '\'' (~['])* '\''
 ;

fragment DQUOTED_STRING  
 : '"' ( ESC_SEQ| ~('\\'|'"') )* '"'  
 ;  

fragment ESC_SEQ  
 : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')  
 | UNICODE_ESC  
 | OCTAL_ESC  
 ;  

fragment OCTAL_ESC  
 : '\\' ('0'..'3') ('0'..'7') ('0'..'7')  
 | '\\' ('0'..'7') ('0'..'7')  
 | '\\' ('0'..'7')  
 ;  

// A single hex digit: UNICODE_ESC below expects exactly four of them after '\u'.
fragment HEX_DIGIT : '0'..'9' | 'a'..'f' | 'A'..'F';

fragment UNICODE_ESC :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT;  

fragment SEMICOLON : ';';

fragment NEWLINE : '\r'?'\n';
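
To make the lexer/parser split concrete, here is a minimal hand-written sketch in Python. It is not ANTLR output and not how ANTLR works internally; it only illustrates the idea that the lexer emits flat tokens (with '+' as its own CONCAT token) and the parser combines them. The names `tokenize` and `run`, the patterns for PRINT, EQUALS, INTEGER and WS (which your post doesn't show), and the choice to turn an integer into text when it is concatenated are all my assumptions.

```python
import re

# Token patterns loosely mirror the lexer rules above. PRINT, EQUALS, INTEGER
# and WS are assumptions: the posted grammar does not show them.
TOKEN_SPEC = [
    ("PRINT",   r"print\b"),
    ("VARNAME", r"\$[A-Z]*"),
    ("INTEGER", r"\d+"),
    ("STRING",  r'"(?:\\.|[^"\\])*"|\'[^\']*\''),
    ("CONCAT",  r"\+"),
    ("EQUALS",  r"="),
    ("STMTEND", r";(?:\r?\n)*|(?:\r?\n)+"),
    ("WS",      r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(src):
    """Lexer: turn the source into flat (kind, text) tokens, skipping spaces."""
    tokens, pos = [], 0
    while pos < len(src):
        m = MASTER.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    tokens.append(("EOF", ""))
    return tokens


def run(src):
    """Parser plus evaluator: returns the list of printed lines."""
    tokens, env, output = tokenize(src), {}, []
    i = 0

    def expect(kind):
        nonlocal i
        if tokens[i][0] != kind:
            raise SyntaxError(f"mismatched input {tokens[i][0]} expecting {kind}")
        i += 1

    def atom():
        # INTEGER | STRING | VARNAME
        nonlocal i
        kind, text = tokens[i]
        i += 1
        if kind == "INTEGER":
            return int(text)
        if kind == "STRING":
            return text[1:-1]  # strip the quotes; escapes are left unprocessed
        if kind == "VARNAME":
            return env[text]
        raise SyntaxError(f"unexpected token {kind}")

    def expression():
        # atom (CONCAT atom)*, the iterative form of 'expression CONCAT expression'
        nonlocal i
        value = atom()
        while tokens[i][0] == "CONCAT":
            i += 1
            value = str(value) + str(atom())  # assumption: '+' joins as text
        return value

    while tokens[i][0] != "EOF":
        kind, text = tokens[i]
        if kind == "PRINT":
            i += 1
            output.append(str(expression()))
        elif kind == "VARNAME":
            i += 1
            expect("EQUALS")
            env[text] = expression()
        else:
            raise SyntaxError(f"unexpected token {kind}")
        expect("STMTEND")
    return output


program = '$C=15;\nprint "dangerdanger!";\nprint "Number of GB left=" + $C;\n'
print(run(program))
```

Because whitespace is dropped between tokens rather than inside one, the spaces around the '+' no longer matter: the parser just sees STRING CONCAT VARNAME STMTEND.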