I'm playing around with ANTLR, designing a toy language (which I think is where most people start), and I have a question about how best to switch on token type.

Consider a 'function call' in the language, where a function can consume a string, a number, or a variable, as in the examples below (project() is the function call):

project("ABC") vs project(123) vs project($SOME_VARIABLE)

I use the alternation operator in my grammar, so the input parses correctly, but in the visitor code it would be nice to tell the three versions above apart.


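    @Override
    public ASTRoot visitCreateproj(projectmgmtParser.CreateprojContext ctx) {
        // Each token accessor returns null when that alternative was not matched,
        // so test for null rather than catching the resulting exception.
        String s1 = null;
        String s2 = null;
        if (ctx.STRING_LITERAL() != null) {
            s1 = ctx.STRING_LITERAL().getText();
        }
        if (ctx.NUM() != null) {
            s2 = ctx.NUM().getText();
        }
        System.out.println("Created Project via => " + ctx.getChild(1).toString());
        // Assumes this class extends the generated base visitor.
        return visitChildren(ctx);
    }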

The code above works: depending on whether s1 or s2 is null, I can infer how the function was called (with a string literal or a number; I haven't shown the variable case). But I'm interested in whether there is a better or more elegant way, for example switching on the token type inside the visitor code.

The grammar I had for the above was:

createproj: 'project('WS?(STRING_LITERAL|NUM)')';

When I use the IntelliJ ANTLR plugin, it seems to know the token type of the argument to the project() function, but I don't seem to be able to get to it from my code.

1 Answer

You could do something like this:

createproj
 : 'project' '(' WS? param ')'
 ;

param
 : STRING_LITERAL 
 | NUM
 ;

and in your visitor code:

@Override
public ASTRoot visitCreateproj(projectmgmtParser.CreateprojContext ctx) {
  switch(ctx.param().start.getType()) {
    case YourLexerName.STRING_LITERAL:
      ...
    case YourLexerName.NUM:
      ...
    ...
  }
}
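
To make the dispatch itself concrete, here is a minimal, self-contained sketch of switching on a token type. It does not use the ANTLR runtime: the Tok class is a hypothetical stand-in for ANTLR's Token, and the constant values are made up (in a real project they come from the generated lexer class). Stripping the quotes from the string literal is also just an assumption about what the visitor might want to do.

```java
// Stand-in sketch: none of these types are ANTLR classes.
final class TokenTypeDispatch {
    // ANTLR generates int constants like these on the lexer; the values here are invented.
    static final int STRING_LITERAL = 1;
    static final int NUM = 2;

    // Hypothetical stand-in for a token: a type plus the matched text.
    static final class Tok {
        private final int type;
        private final String text;

        Tok(int type, String text) {
            this.type = type;
            this.text = text;
        }

        int getType() { return type; }
        String getText() { return text; }
    }

    // Decide how the argument was written by looking at the token type only.
    static String describe(Tok start) {
        switch (start.getType()) {
            case STRING_LITERAL: {
                // Drop the surrounding quotes of the literal.
                String raw = start.getText();
                return "string:" + raw.substring(1, raw.length() - 1);
            }
            case NUM:
                return "number:" + Integer.parseInt(start.getText());
            default:
                throw new IllegalArgumentException("unexpected token type " + start.getType());
        }
    }
}
```

Returning from each case (or ending it with break) avoids accidental fall-through, and the default branch makes an unhandled token type fail loudly instead of being silently ignored.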

So by inlining the token in the grammar I had originally, I've lost the opportunity to inspect it in the visitor code?

Not really, you could also do it like this:

createproj
 : 'project' '(' WS? param_token=(STRING_LITERAL | NUM) ')'
 ;

and could then do this:

@Override
public ASTRoot visitCreateproj(projectmgmtParser.CreateprojContext ctx) {
  switch(ctx.param_token.getType()) {
    case YourLexerName.STRING_LITERAL:
      ...
    case YourLexerName.NUM:
      ...
    ...
  }
}

Just make sure you don't mix lexer rules (tokens) and parser rules in your set param_token=( ... ). When it's a parser rule, ctx.param_token.getType() will fail (it must then be ctx.param_token.start.getType()). That is why I recommended adding an extra parser rule, because this would then still work:

param
 : STRING_LITERAL 
 | NUM
 | some_parser_rule
 ;
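
As a rough illustration of why looking at the start token keeps working when one alternative is a parser rule, the sketch below models a tiny parse tree without the ANTLR runtime. Everything in it is a hypothetical stand-in: Tok and Ctx are not ANTLR classes, the constants are invented, and it assumes some_parser_rule begins with a VARIABLE token such as $SOME_VARIABLE. In ANTLR the start token is recorded by the parser; here it is computed from the first child to show the same idea.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in sketch: a toy parse tree, not the ANTLR runtime.
final class StartTokenSketch {
    // Invented token type constants; VARIABLE is an assumed extra token.
    static final int STRING_LITERAL = 1;
    static final int NUM = 2;
    static final int VARIABLE = 3;

    // Both leaves (tokens) and rule contexts can report their first token.
    interface Node {
        Tok start();
    }

    static final class Tok implements Node {
        final int type;
        final String text;

        Tok(int type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public Tok start() { return this; }
    }

    // A rule context: its first token is the first token of its first child.
    static final class Ctx implements Node {
        final List<Node> children;

        Ctx(Node... children) {
            this.children = Arrays.asList(children);
        }

        @Override
        public Tok start() { return children.get(0).start(); }
    }

    // Works whether param matched a bare token or a nested rule.
    static String kindOf(Ctx param) {
        switch (param.start().type) {
            case STRING_LITERAL: return "string";
            case NUM: return "number";
            case VARIABLE: return "variable";
            default: return "unknown";
        }
    }
}
```

For example, a param context wrapping a NUM token reports "number", while a param context wrapping a nested rule context whose first token is a VARIABLE reports "variable", with no change to the switch.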