
I am trying to get a semantic predicate to work. This seems straightforward, but somehow it doesn't work: based on a boolean condition, I need to either execute a rule (which produces an AST) or construct one manually.

Below is the parser rule:

displayed_column
  :   
    {columnAliases.containsKey($text)}? 
    =>-> ^(PROPERTY_LIST ^(PROPERTY IDENTIFIER[columnAliases.get($text)])) 
  | sql_expression
  ;

I have tried both gated and disambiguating predicates as well, but when stepping through the code it always goes to the second alternative (sql_expression).

Can anyone please help me out ?

Thanks

EDIT: I just realized that $text is empty while the predicate is running, which is why it always matches the second alternative. I changed the rule to this and it works:

displayed_column
  :
        sql_expression
        -> {columnAliases.containsKey($text)}? ^(PROPERTY_LIST ^(PROPERTY IDENTIFIER[columnAliases.get($text)])) 
        -> sql_expression
  ;

However, I have now run into a different problem: manually constructing the tree will not work. I need to re-run the rule displayed_column with the new text (the value from the columnAliases map). Is that possible?

This was my original question: https://stackguides.com/questions/14170541/antlr-dynamic-input-stream-modification-during-parsing

Basically, I am trying to interactively parse and interpret SQL-like statements, for example:

select a.b.c from pool;
select min(abc.def[*]) from pool;

Since the column names might be a bit long, I let the user define aliases for column names (through a separate command). For example, the user might set an alias and then run commands that use it:

set column_alias a.b.c d;
select d from pool;

Now, while parsing, I inject the preferences (a Map) into the generated parser and try to map the alias back to the original column before continuing to interpret. Handling it in the parser seemed like the only option, since I thought it would be difficult to do in the tree grammar because the column spans multiple rules.
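
To make the intended mapping concrete, here is a minimal Java sketch of the lookup, independent of ANTLR. The class AliasResolver and its methods are hypothetical names used only for illustration; only the idea of an alias-to-column Map comes from the setup above. A known alias resolves to the original column text, and anything else passes through unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the grammar in the question.
final class AliasResolver {
    // Maps alias -> original column text, e.g. "d" -> "a.b.c".
    private final Map<String, String> columnAliases = new HashMap<>();

    // Mirrors the command "set column_alias <original> <alias>;".
    void setAlias(String original, String alias) {
        columnAliases.put(alias, original);
    }

    // Returns the original column if the text is a known alias,
    // otherwise returns the text unchanged.
    String resolve(String columnText) {
        return columnAliases.getOrDefault(columnText, columnText);
    }

    public static void main(String[] args) {
        AliasResolver resolver = new AliasResolver();
        resolver.setAlias("a.b.c", "d");
        System.out.println(resolver.resolve("d"));               // prints a.b.c
        System.out.println(resolver.resolve("min(abc.def[*])")); // prints min(abc.def[*])
    }
}
```

The two outcomes of resolve correspond to the two cases the parser has to distinguish: the matched text is an alias, or it is an ordinary expression.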

I could post the entire grammar, but it's a bit too long. Here is a scaled-down version of it:

select_stmt
  : 'select' displayed_column 'from' pool
  ;

displayed_column
  : sql_expression 
  ;

sql_expression
  : term ( (PLUS^ | MINUS^) term)*
  ;

term  : factor ( (ASTERISK^ | DIVIDE^) factor)*
  ;

... <more_rules> ...

I am stuck on this. Using StringTemplate to output a translated statement and then reparsing it seems like the only option, but that would entail rewriting the entire grammar to output templates (right now I have a combined grammar which outputs an AST and a tree grammar that interprets it). It would be greatly appreciated if someone could suggest a less intrusive way.

Thanks again.

You're either matching nothing (the first alternative), or you match a sql_expression (the second alternative). So, matching nothing will only let columnAliases.containsKey($text) evaluate to true if your Map (assuming it is a Map) contains an empty string as key. Could you provide some more context? Possibly give some example input strings and desired AST's as output? - Bart Kiers
Hi Bart, I've added more details to the question. - jack_carver

1 Answer

Instead of storing the strings as values, why not store the actual ASTs in your map? These ASTs can then be injected by wrapping them inside { ... } in your rewrite rule(s).

A demo:

grammar T;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  STATS;
  DISPLAYED_COLUMN;
  NAME;
  SELECT;
}

@parser::header {
  import java.util.Map;
  import java.util.HashMap;
}

@parser::members {
  private Map<String, CommonTree> aliases = new HashMap<String, CommonTree>();
}

parse
 : (stmt ';')+ EOF -> ^(STATS stmt+)
 ;

stmt
 : set_stmt
 | select_stmt
 ;

set_stmt
 : 'set' 'alias' name Id {aliases.put($Id.text, $name.tree);} -> /* AST can be omitted */
 ;

select_stmt
 : 'select' displayed_column 'from' name -> ^(SELECT displayed_column name)
 ;

displayed_column
 : sql_expression -> {aliases.containsKey($text)}? ^(DISPLAYED_COLUMN {aliases.get($text)})
                  ->                               ^(DISPLAYED_COLUMN sql_expression)
 ;

sql_expression
 : term (('+' | '-')^ term)*
 ;

term
 : factor (('*' | '/')^ factor)*
 ;

factor
 : Num
 | name
 | '(' sql_expression ')'
 ;

name
 : Id ('.' Id)* -> ^(NAME Id+)
 ;

Id    : 'a'..'z'+;
Num   : '0'..'9'+;
Space : (' ' | '\t' | '\r' | '\n')+ {skip();};

Parsing the input:

select d from pool;
set alias a.b.c d;
select d from pool;

would result in the following AST:

[image: diagram of the resulting AST]

EDIT

Thanks Bart! The only thing is that I need to persist these preferences in a data store so that the user doesn't need to re-enter them. I was hoping I could serialize CommonTree.

:( Alas, it is not Serializable.

In that case, you can store the values as strings, and create an AST on the fly using a small helper method createNameAST(String alias) and inject the AST this method creates:

grammar T;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  STATS;
  DISPLAYED_COLUMN;
  NAME;
  SELECT;
}

@parser::header {
  import java.util.Map;
  import java.util.HashMap;
}

@parser::members {
  private Map<String, String> aliases = new HashMap<String, String>();

  private CommonTree createNameAST(String alias) {
    try {
      TLexer lexer = new TLexer(new ANTLRStringStream(aliases.get(alias)));
      TParser parser = new TParser(new CommonTokenStream(lexer));
      return (CommonTree)parser.name().getTree();  
    } catch(Exception e) {
      throw new RuntimeException(e);
    }
  }
}

parse
 : (stmt ';')+ EOF -> ^(STATS stmt+)
 ;

stmt
 : set_stmt
 | select_stmt
 ;

set_stmt
 : 'set' 'alias' name Id {aliases.put($Id.text, $name.text);} -> /* AST can be omitted */
 ;

select_stmt
 : 'select' displayed_column 'from' name -> ^(SELECT displayed_column name)
 ;

displayed_column
 : sql_expression -> {aliases.containsKey($text)}? ^(DISPLAYED_COLUMN {createNameAST($text)})
                  ->                               ^(DISPLAYED_COLUMN sql_expression)
 ;

sql_expression
 : term (('+' | '-')^ term)*
 ;

term
 : factor (('*' | '/')^ factor)*
 ;

factor
 : Num
 | name
 | '(' sql_expression ')'
 ;

name
 : Id ('.' Id)* -> ^(NAME Id+)
 ;

Id    : 'a'..'z'+;
Num   : '0'..'9'+;
Space : (' ' | '\t' | '\r' | '\n')+ {skip();};
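
Because the map now holds plain strings, persisting it becomes simple. Below is a minimal sketch, assuming java.util.Properties is an acceptable storage format; the class AliasStore and its method names are hypothetical and not part of the grammar above. It serializes the alias map to text and reads it back, so the result can be written to whatever data store you use.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; assumes the Properties text format is acceptable.
final class AliasStore {
    // Turns the alias map (alias -> original name) into Properties-formatted text.
    static String serialize(Map<String, String> aliases) {
        Properties props = new Properties();
        props.putAll(aliases);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // null: no extra comment line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Rebuilds the alias map from text produced by serialize().
    static Map<String, String> deserialize(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> aliases = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            aliases.put(key, props.getProperty(key));
        }
        return aliases;
    }

    public static void main(String[] args) {
        Map<String, String> aliases = new HashMap<>();
        aliases.put("d", "a.b.c");
        Map<String, String> restored = deserialize(serialize(aliases));
        System.out.println(restored.get("d")); // prints a.b.c
    }
}
```

A restored map could then be handed to the parser before parsing starts, for example through a setter you add to the @parser::members section (also an assumption; the grammar above keeps the map private).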

In case you're using the debugger from ANTLRWorks: it might have an issue with the method createNameAST because it uses a TParser. Create a small test case by hand:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = 
        "select d from pool; \n" + 
        "set alias a.b.c.x d; \n" +
        "select d from pool;";
    TLexer lexer = new TLexer(new ANTLRStringStream(src));
    TParser parser = new TParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.parse().getTree();  
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}

and run this all on the command line:

java -cp antlr-3.3.jar org.antlr.Tool T.g 
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main > ast.dot

and you'll get a DOT file that represents the same AST posted earlier.