I'm developing a grammar for an old language.
The language is quite complex, but I want to focus on one specific issue, so I made a light version of it. The light version allows assignment statements and simple expressions such as arithmetic operations or string concatenation.
For example:
@assign[@var1 (1+3)*2]
@assign[@var2 "foo" $ "bar"]
Note: inside an assignment statement, the leading @ on a variable name is optional. A statement can also span multiple lines, so the following assignments are all equivalent:
@assign[@var2 "foo" $ "bar"]
@assign[var2 "foo" $ "bar"]
@assign
[@var2 "foo"
$ "bar"]
@assign
[var2 "foo"
$ "bar"]
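To make the equivalence concrete, here is a minimal Python sketch (not ANTLR, and not part of my grammar) that reduces each of the four forms to the same name and expression. The regular expression and the `normalize` helper are hypothetical, and collapsing whitespace is only done for the comparison: it would be wrong for string literals that contain runs of spaces.

```python
import re

# Hypothetical pattern: "@assign", optional whitespace or newlines, "[",
# an optional "@" before the variable name, then the raw expression text
# up to the closing "]".
ASSIGN_RE = re.compile(
    r"@assign\s*\[\s*@?([A-Za-z_][A-Za-z_0-9]*)\s+(.*?)\s*\]",
    re.DOTALL,
)

def normalize(source):
    """Return (variable name, expression with whitespace collapsed), or None."""
    match = ASSIGN_RE.fullmatch(source.strip())
    if match is None:
        return None
    name, expr = match.groups()
    # Collapse whitespace so that line breaks do not affect the comparison.
    return name, " ".join(expr.split())

forms = [
    '@assign[@var2 "foo" $ "bar"]',
    '@assign[var2 "foo" $ "bar"]',
    '@assign\n[@var2 "foo"\n$ "bar"]',
    '@assign\n[var2 "foo"\n$ "bar"]',
]

# All four spellings should normalize to one and the same pair.
results = {normalize(form) for form in forms}
print(results)
```

In the grammar itself, the same effect comes from the optional AT in the variable rule and from skipping whitespace in the lexer.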
In this language you can also print the value of a variable. The problem is that there is no dedicated command for it (like @print[...]); simply writing the variable is enough. Like this:
@var1 @var2
So the output for this code
@assign[@var1 (1+3)*2]
@assign[@var2 "foo" $ "bar"]
@var1 @var2
is:
8 foobar
Here is the grammar I've written so far, starting from the Mu grammar file:
grammar Grammar;
////////////////
// PARSER //
////////////////
file
: block EOF
;
block
: stat*
;
stat
: assignment
| print
;
assignment
: ASSIGN LBRACKET variable expr RBRACKET
;
print
: AT ID
;
expr
: expr CONCAT expr #concatExpr
| expr op=(MUL | DIV) expr #mulDivExpr // one precedence level, so 8/2*3 parses as (8/2)*3
| expr op=(ADD | SUB) expr #addSubExpr // one precedence level, so 1-2+3 parses as (1-2)+3
| atom #atomExpr
;
variable
: AT ID
| ID
;
atom
: LPARENS expr RPARENS #parExpr
| INT #intAtom
| STRING #stringAtom
| variable #variableAtom
;
///////////////
// LEXER //
///////////////
ASSIGN : AT 'assign' ;
AT : '@' ;
ID : [a-zA-Z_] [a-zA-Z_0-9]* ;
INT
: [0-9]+
;
LBRACKET : '[' ;
RBRACKET : ']' ;
LPARENS : '(' ;
RPARENS : ')' ;
CONCAT : '$' ;
ADD : '+' ;
SUB : '-' ;
MUL : '*' ;
DIV : '/' ;
WS : [ \t\r\n] -> skip ;
COMMENT : '[*' .*? '*]' -> skip ;
STRING : '"' (~["\r\n] | '""')* '"' ;
To print the variables I wrote a custom visitor. In its visitPrint method, I know that the context contains two tokens: AT and ID.
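As a rough illustration of what that visitor step amounts to, here is a stand-in written in plain Python rather than against the ANTLR-generated classes. Everything in it is an assumption made for the example: the `memory` dictionary, the `visit_print` function, the token tuples, and the choice to join printed values with a single space.

```python
# Hypothetical symbol table, as if filled by earlier @assign statements.
memory = {"var1": 8, "var2": "foobar"}

def visit_print(tokens, memory):
    """Mimic the print rule: tokens holds its two children, AT and ID."""
    (at_kind, _), (id_kind, name) = tokens
    if (at_kind, id_kind) != ("AT", "ID"):
        raise ValueError("print expects AT followed by ID")
    if name not in memory:
        raise KeyError(f"undefined variable: {name}")
    # Only the ID token matters: look up its value and render it as text.
    return str(memory[name])

# Two print statements, "@var1 @var2", joined with a space (an assumption).
output = " ".join(
    visit_print([("AT", "@"), ("ID", name)], memory)
    for name in ("var1", "var2")
)
print(output)  # 8 foobar
```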
Now the question.
How can I modify my grammar so that the following example code
@assign[@var1 "one"]
@assign[var2 "two"]
@assign[var3 var1 $ var2]
Value of var3 is: @var3
generates this output?
Value of var3 is: onetwo
The goal is to make the grammar accept free text and print it as is.
I imagine I have to rewrite the print rule, but how?
print
: AT ID
| ?????? //Help!
;
Ideally, "Value of var3 is: " should also be a single token (not one token per word).
This attempt is surely the wrong way:
print
: AT ID
| .+?
;
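To make the single-token goal concrete, here is a minimal Python sketch (not ANTLR) of the token stream I would like to end up with. The token kinds ASSIGN, PRINT and TEXT, the regular expression, and the `tokenize` helper are all hypothetical. The sketch also assumes that the line break right after an assignment belongs to the statement, and that no string literal inside an assignment contains a "]".

```python
import re

# Hypothetical token kinds, not taken from the grammar above:
#   ASSIGN - a whole "@assign[...]" statement, kept opaque here, plus the
#            line break that follows it (an assumption)
#   PRINT  - "@" followed by an identifier
#   TEXT   - any run of characters that contains no "@"
TOKEN_RE = re.compile(
    r"(?P<ASSIGN>@assign\s*\[.*?\][ \t]*(?:\r?\n)?)"
    r"|(?P<PRINT>@[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<TEXT>[^@]+)",
    re.DOTALL,
)

def tokenize(source):
    """Split the source into (kind, text) pairs; free text stays in one piece."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]

source = (
    '@assign[@var1 "one"]\n'
    '@assign[var2 "two"]\n'
    '@assign[var3 var1 $ var2]\n'
    'Value of var3 is: @var3'
)

tokens = tokenize(source)
for kind, text in tokens:
    print(kind, repr(text))
```

The point is only the shape of the result: three ASSIGN tokens, then one TEXT token holding "Value of var3 is: " in full, then one PRINT token for @var3.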
Thanks in advance.