I am trying to get a semantic predicate to work. This seems straightforward, but somehow it doesn't work. Based on a boolean condition, I need to either execute a rule (which produces an AST) or construct one manually.

Below is the parser rule:

displayed_column
  :   
    {columnAliases.containsKey($text)}? 
    =>-> ^(PROPERTY_LIST ^(PROPERTY IDENTIFIER[columnAliases.get($text)])) 
  | sql_expression
  ;

I have tried both gated and disambiguating predicates as well, but when I step through the code it always takes the second alternative (sql_expression).

Can anyone please help me out?

Thanks

EDIT: I just realized that $text is empty while the predicate is running, which is why it always matches the second alternative. I changed the rule to this and it works:

displayed_column
  :
        sql_expression
        -> {columnAliases.containsKey($text)}? ^(PROPERTY_LIST ^(PROPERTY IDENTIFIER[columnAliases.get($text)]))
        -> sql_expression
  ;

However, I have run into a different problem now: manually constructing the tree will not work. I need to re-run the rule displayed_column with the new text (the value from the columnAliases Map). Is that possible?

This was my original question https://stackguides.com/questions/14170541/antlr-dynamic-input-stream-modification-during-parsing

Basically, I am trying to interactively parse and interpret SQL-like statements, for example:

select a.b.c from pool;
select min(abc.def[*]) from pool;

Since the column names might be a bit long, I let the user alias column names (through a different command). For example, the user might set a preference and then run their commands:

set column_alias a.b.c d;
select d from pool;

Now, while parsing, I inject the preferences (a Map) into the generated parser and try to map the alias back to the original column before continuing to interpret. Handling it in the parser seemed like the only option, since I thought it would be difficult to do in the tree grammar because the column spans multiple rules.

I could post the entire grammar, but it's a bit too long. Here is a scaled-down version of it:

select_stmt
  : 'select' displayed_column 'from' pool
  ;

displayed_column
  : sql_expression 
  ;

sql_expression
  : term ( (PLUS^ | MINUS^) term)*
  ;

term  : factor ( (ASTERISK^ | DIVIDE^) factor)*
  ;

... <more_rules> ...

I am stuck on this. Using StringTemplate to output a translated statement and then reparsing it seems like the only option to me, but this would entail rewriting the entire grammar to output templates (right now I have a combined grammar which outputs an AST, and a tree grammar that interprets it). It would be greatly appreciated if someone could suggest a less intrusive way.

Thanks again.

You're either matching nothing (the first alternative), or you match a sql_expression (the second alternative). So, matching nothing will only let columnAliases.containsKey($text) evaluate to true if your Map (assuming it is a Map) contains an empty string as a key. Could you provide some more context? Possibly give some example input strings and the desired ASTs as output? - Bart Kiers

Hi Bart, I've added more details to the question. - jack_carver

1 Answer

Instead of storing the strings as values, why not store the actual ASTs in your map? These ASTs can then be injected by wrapping them inside { ... } in your rewrite rule(s).

A demo:

grammar T;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  STATS;
  DISPLAYED_COLUMN;
  NAME;
  SELECT;
}

@parser::header {
  import java.util.Map;
  import java.util.HashMap;
}

@parser::members {
  private Map<String, CommonTree> aliases = new HashMap<String, CommonTree>();
}

parse
 : (stmt ';')+ EOF -> ^(STATS stmt+)
 ;

stmt
 : set_stmt
 | select_stmt
 ;

set_stmt
 : 'set' 'alias' name Id {aliases.put($Id.text, $name.tree);} -> /* AST can be omitted */
 ;

select_stmt
 : 'select' displayed_column 'from' name -> ^(SELECT displayed_column name)
 ;

displayed_column
 : sql_expression -> {aliases.containsKey($text)}? ^(DISPLAYED_COLUMN {aliases.get($text)})
                  ->                               ^(DISPLAYED_COLUMN sql_expression)
 ;

sql_expression
 : term (('+' | '-')^ term)*
 ;

term
 : factor (('*' | '/')^ factor)*
 ;

factor
 : Num
 | name
 | '(' sql_expression ')'
 ;

name
 : Id ('.' Id)* -> ^(NAME Id+)
 ;

Id    : 'a'..'z'+;
Num   : '0'..'9'+;
Space : (' ' | '\t' | '\r' | '\n')+ {skip();};

Parsing the input:

select d from pool;
set alias a.b.c d;
select d from pool;

would result in the following AST:

[Image: diagram of the resulting AST]
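
To make the conditional rewrite in displayed_column more concrete, here is a minimal sketch that models it in plain Java, without ANTLR. The class AliasRewriteSketch, its nested Node class and the displayedColumn method are hypothetical stand-ins invented for illustration (Node is not ANTLR's CommonTree); only the decision logic mirrors the grammar: if the matched text is a known alias, the stored tree is injected, otherwise the parsed sub-tree is kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AliasRewriteSketch {

    // Hypothetical stand-in for an AST node; this is NOT ANTLR's CommonTree.
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            children.addAll(Arrays.asList(kids));
        }

        // Renders the tree in LISP style, e.g. (NAME a b c).
        String toStringTree() {
            if (children.isEmpty()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node child : children) {
                sb.append(' ').append(child.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    // Alias text -> tree to inject (plays the role of `aliases` in the grammar).
    static final Map<String, Node> aliases = new HashMap<>();

    // Models the two rewrite alternatives of displayed_column:
    //   known alias -> ^(DISPLAYED_COLUMN <stored tree>)
    //   otherwise   -> ^(DISPLAYED_COLUMN <parsed sql_expression>)
    static Node displayedColumn(String text, Node parsed) {
        Node stored = aliases.get(text);
        return new Node("DISPLAYED_COLUMN", stored != null ? stored : parsed);
    }

    public static void main(String[] args) {
        Node d = new Node("NAME", new Node("d"));

        // Before the alias is set, the parsed tree is kept as-is.
        System.out.println(displayedColumn("d", d).toStringTree());

        // Models: set alias a.b.c d;
        aliases.put("d", new Node("NAME", new Node("a"), new Node("b"), new Node("c")));

        // After the alias is set, the stored tree replaces the parsed one.
        System.out.println(displayedColumn("d", d).toStringTree());
    }
}
```

Running the sketch prints (DISPLAYED_COLUMN (NAME d)) for the first select and (DISPLAYED_COLUMN (NAME a b c)) for the second, which is the same substitution the grammar performs on its own trees.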

EDIT

Thanks Bart! The only thing is that I need to persist these preferences in a data store so that the user doesn't need to re-enter them, so I am hoping I can serialize CommonTree.

:( Alas, it is not Serializable.

In that case, you can store the values as strings, and create an AST on the fly using a small helper method createNameAST(String alias) and inject the AST this method creates:

grammar T;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  STATS;
  DISPLAYED_COLUMN;
  NAME;
  SELECT;
}

@parser::header {
  import java.util.Map;
  import java.util.HashMap;
}

@parser::members {
  private Map<String, String> aliases = new HashMap<String, String>();

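  // Re-parses the stored column text with a fresh lexer and parser,
  // starting at the name rule, and returns the resulting AST.
  private CommonTree createNameAST(String alias) {
    try {
      TLexer lexer = new TLexer(new ANTLRStringStream(aliases.get(alias)));
      TParser parser = new TParser(new CommonTokenStream(lexer));
      return (CommonTree)parser.name().getTree();
    } catch(Exception e) {
      // Wrap checked exceptions so the method can be called from a rewrite rule.
      throw new RuntimeException(e);
    }
  }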
}

parse
 : (stmt ';')+ EOF -> ^(STATS stmt+)
 ;

stmt
 : set_stmt
 | select_stmt
 ;

set_stmt
 : 'set' 'alias' name Id {aliases.put($Id.text, $name.text);} -> /* AST can be omitted */
 ;

select_stmt
 : 'select' displayed_column 'from' name -> ^(SELECT displayed_column name)
 ;

displayed_column
 : sql_expression -> {aliases.containsKey($text)}? ^(DISPLAYED_COLUMN {createNameAST($text)})
                  ->                               ^(DISPLAYED_COLUMN sql_expression)
 ;

sql_expression
 : term (('+' | '-')^ term)*
 ;

term
 : factor (('*' | '/')^ factor)*
 ;

factor
 : Num
 | name
 | '(' sql_expression ')'
 ;

name
 : Id ('.' Id)* -> ^(NAME Id+)
 ;

Id    : 'a'..'z'+;
Num   : '0'..'9'+;
Space : (' ' | '\t' | '\r' | '\n')+ {skip();};
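
Because the alias map now holds plain strings, it can be persisted with standard Java facilities. The following is only a sketch under assumptions: it uses java.util.Properties as the storage format, and the class AliasStoreSketch with its save and load methods is hypothetical, not part of the grammar above. It writes to a string for brevity; a real data store would use a file or database instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class AliasStoreSketch {

    // Serializes alias -> column text pairs into the java.util.Properties text format.
    static String save(Map<String, String> aliases) {
        Properties props = new Properties();
        props.putAll(aliases);
        StringWriter out = new StringWriter();
        try {
            props.store(out, "column aliases");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Reads the pairs back into a fresh map that could be handed to the parser.
    static Map<String, String> load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> aliases = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            aliases.put(key, props.getProperty(key));
        }
        return aliases;
    }

    public static void main(String[] args) {
        Map<String, String> aliases = new HashMap<>();
        aliases.put("d", "a.b.c");   // models: set alias a.b.c d;

        String stored = save(aliases);
        Map<String, String> restored = load(stored);

        System.out.println(restored.get("d"));
    }
}
```

The restored map could then be injected into the generated parser before parsing, in place of the empty HashMap created in the members section; how it is injected (constructor, setter) is left open here.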

In case you're using the debugger from ANTLRWorks: it might have an issue with the method createNameAST because it uses a TParser. Create a small test case by hand:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = 
        "select d from pool; \n" + 
        "set alias a.b.c.x d; \n" +
        "select d from pool;";
    TLexer lexer = new TLexer(new ANTLRStringStream(src));
    TParser parser = new TParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.parse().getTree();  
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}

and run this all on the command line:

java -cp antlr-3.3.jar org.antlr.Tool T.g 
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main > ast.dot

and you'll get a DOT file that represents the same AST posted earlier.