I'm writing a simple query parser with ANTLR, and I'd like to retrieve a tree representation of the query. For this, I'm using options { output=AST; }. Then I parse the query and get the tree (the code is Python):

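import antlr3
# assumes the generated modules are named after the grammar
from MyQueryLexer import MyQueryLexer
from MyQueryParser import MyQueryParser
char_stream = antlr3.ANTLRStringStream("color = 1")
lexer = MyQueryLexer(char_stream)
tokens = antlr3.CommonTokenStream(lexer)
parser = MyQueryParser(tokens)
q = parser.query()  # query is my root rule
# do something with q.tree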

Now, the tree I get from my parser does not include any rule names, just tokens in a flat list. I can use rewrite rules and the ^ and ! operators to get them into a tree structure, but the nodes are still just tokens. For example, part of a query might be color = 1. That matches the following rule (simplified):

propcondition
: propertyname '=' value

That would be turned into:

# token type, text
5 color
20 =
8 1

With '='^ I can turn it into:

20 =
  5 color
  8 1
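Listings like the ones above can be produced with a small recursive helper. The following is only a sketch: the Node class is a hypothetical stand-in so the example runs without the ANTLR runtime, and it mimics the getType(), getText(), getChildCount() and getChild() accessors of the antlr3 Python tree nodes. With a real parser you would pass q.tree instead.

```python
class Node:
    """Hypothetical stand-in for an ANTLR 3 tree node (illustration only)."""

    def __init__(self, type_, text, children=()):
        self._type = type_
        self._text = text
        self._children = list(children)

    def getType(self):
        return self._type

    def getText(self):
        return self._text

    def getChildCount(self):
        return len(self._children)

    def getChild(self, i):
        return self._children[i]


def dump(node, depth=0):
    """Return one 'type text' line per node, indented two spaces per level."""
    lines = ["%s%d %s" % ("  " * depth, node.getType(), node.getText())]
    for i in range(node.getChildCount()):
        lines.extend(dump(node.getChild(i), depth + 1))
    return lines


# The tree produced by '='^ for the input "color = 1" (token types as in the listing).
tree = Node(20, "=", [Node(5, "color"), Node(8, "1")])
print("\n".join(dump(tree)))
```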

But I'd like that fragment to remember that it was matched as a "propcondition". The closest thing I could find was introducing fake tokens with rewrite rules:

propcondition
: propertyname '=' value -> ^(PROPCONDITION propertyname '=' value)
// ...

which then gives:

4 PROPCONDITION
  5 color
  20 =
  8 1

Is this the way to go? I have a feeling I'm missing some basic function here.

1 Answer

Yes, that is the way to go. Note that if you create a root called PROPCONDITION, you can drop the '=' sign: such a condition will only ever have two children, right? A left-hand side and a right-hand side.

propcondition
 : propertyname '=' value -> ^(PROPCONDITION propertyname value)
 ;

This creates a tree with PROPCONDITION as the root and the property name and the value as its two children (the original image is not available).
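One benefit of the imaginary root is that tree-walking code can find every fragment matched by the rule simply by checking the node type. The sketch below assumes PROPCONDITION has token type 4 as in the question's listing (in a real grammar it is declared in the tokens { ... } section and the generated parser exposes the constant). The Node class, the AND node and its type 99 are hypothetical stand-ins so the example runs without the ANTLR runtime.

```python
PROPCONDITION = 4  # value from the question's listing; yours may differ
AND = 99           # hypothetical token type, for illustration only


class Node:
    """Hypothetical stand-in for an ANTLR 3 tree node (illustration only)."""

    def __init__(self, type_, text, children=()):
        self._type, self._text, self._children = type_, text, list(children)

    def getType(self):
        return self._type

    def getText(self):
        return self._text

    def getChildCount(self):
        return len(self._children)

    def getChild(self, i):
        return self._children[i]


def find_all(node, token_type):
    """Collect every node of the given type, in depth-first order."""
    found = [node] if node.getType() == token_type else []
    for i in range(node.getChildCount()):
        found.extend(find_all(node.getChild(i), token_type))
    return found


def as_pair(cond):
    """Read the two children of a PROPCONDITION node: (property name, value)."""
    return (cond.getChild(0).getText(), cond.getChild(1).getText())


query = Node(AND, "AND", [
    Node(PROPCONDITION, "PROPCONDITION", [Node(5, "color"), Node(8, "1")]),
    Node(PROPCONDITION, "PROPCONDITION", [Node(5, "size"), Node(8, "2")]),
])
pairs = [as_pair(c) for c in find_all(query, PROPCONDITION)]
print(pairs)
```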

If there are more operators, keep the operator as a child of the PROPCONDITION root:

propcondition
 : propertyname '=' value -> ^(PROPCONDITION ^('=' propertyname value))
 | propertyname '<' value -> ^(PROPCONDITION ^('<' propertyname value))
 | ...
 ;
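With the operator kept as the single child of PROPCONDITION, consuming code can dispatch on the operator's text. The following is a minimal sketch under several assumptions: values are integers, token type 21 for '<' is made up, the matches() helper and the record dictionary are hypothetical, and Node is a stand-in for the runtime's tree nodes so the example runs on its own.

```python
import operator

PROPCONDITION = 4  # value from the question's listing; yours may differ

# Map operator text to a comparison function; extend as the grammar grows.
OPS = {"=": operator.eq, "<": operator.lt}


class Node:
    """Hypothetical stand-in for an ANTLR 3 tree node (illustration only)."""

    def __init__(self, type_, text, children=()):
        self._type, self._text, self._children = type_, text, list(children)

    def getType(self):
        return self._type

    def getText(self):
        return self._text

    def getChildCount(self):
        return len(self._children)

    def getChild(self, i):
        return self._children[i]


def matches(cond, record):
    """Evaluate ^(PROPCONDITION ^(op propertyname value)) against a dict."""
    if cond.getType() != PROPCONDITION:
        raise ValueError("expected a PROPCONDITION node")
    op_node = cond.getChild(0)
    name = op_node.getChild(0).getText()
    value = int(op_node.getChild(1).getText())  # assumes integer values
    return OPS[op_node.getText()](record[name], value)


eq_cond = Node(PROPCONDITION, "PROPCONDITION",
               [Node(20, "=", [Node(5, "color"), Node(8, "1")])])
lt_cond = Node(PROPCONDITION, "PROPCONDITION",
               [Node(21, "<", [Node(5, "size"), Node(8, "10")])])  # 21 is made up

print(matches(eq_cond, {"color": 1}))
print(matches(lt_cond, {"size": 12}))
```

An operator that is missing from OPS raises a KeyError, which is a reasonable way to notice that the grammar and the evaluator have drifted apart.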