antlr: Best practice to integrate generated parser into the system

Question

Here is the background: I am trying to create a DSL that lets customers write simple scripts to query our MongoDB-based database. I chose ANTLR to implement the DSL.

From my understanding (please let me know if this is not correct), there are two approaches to integrating an ANTLR-generated parser into the system:

1. Embed code in the grammar file so that the generated parser can be used directly to query the database and return the result in a certain format (e.g. JSON-encoded).
2. Keep the parser purely a parser: feed the DSL file to it, then construct the query in another class by retrieving the AST from the generated parser class.

So, ANTLR users, which one do you think I, as an ANTLR newbie, should go with? Can you list the pros and cons of each approach, or do you have another way to recommend?

There is no single "best way". Personally, I like to keep my grammar as empty as possible (as little embedded code as is possible). I created a few blog posts on how to parse and evaluate a small programming language (using ANTLR) here: bkiers.blogspot.nl/2011/03/… — Bart Kiers

Sergey Kalinichenko · Accepted Answer · 2012-06-26T03:11:55

You should not embed business logic in your parser, because of the potential maintenance headaches: you do not want to recompile your grammar to fix bugs in business logic, let alone face the nightmarish prospect of debugging your business logic inside a generated parser. So option one should be out of the question for anything other than the smallest toy projects.

Option two can be implemented in more than one way: you can (A) have ANTLR generate AST nodes for you and write a tree parser, or (B) skip the tree parser and produce your own representation in the actions of the parser. I have tried both approaches on production projects, in both Java and C#, and both worked very well. I think the choice between (A) and (B) is largely a matter of personal taste, as long as the business logic is kept out of the parser.
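To make the separation concrete, here is a minimal sketch of what approach (B) might look like in Java. It is not ANTLR-specific: the class names (QuerySketch, Comparison, And, MongoFilterBuilder) are hypothetical, the tiny node model is an assumption about what parser actions could produce, and the hand-built tree in main stands in for real parser output. The point is only that the MongoDB-specific logic sits in its own class, so it can be fixed and tested without regenerating the grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from ANTLR or from the question.
public class QuerySketch {

    // Plain data representation that the parser actions would build.
    interface Node {}

    // A comparison such as: age > 30
    static final class Comparison implements Node {
        final String field;
        final String op;     // one of "=", ">", "<" in this sketch
        final Object value;  // a String or a Number

        Comparison(String field, String op, Object value) {
            this.field = field;
            this.op = op;
            this.value = value;
        }
    }

    // A conjunction such as: a AND b
    static final class And implements Node {
        final List<Node> terms;

        And(List<Node> terms) {
            this.terms = terms;
        }
    }

    // Business logic lives here, outside the grammar.
    static final class MongoFilterBuilder {

        // Turns the node tree into a MongoDB-style filter document, as JSON text.
        String build(Node node) {
            if (node instanceof Comparison) {
                Comparison c = (Comparison) node;
                String v = render(c.value);
                switch (c.op) {
                    case "=": return "{\"" + c.field + "\": " + v + "}";
                    case ">": return "{\"" + c.field + "\": {\"$gt\": " + v + "}}";
                    case "<": return "{\"" + c.field + "\": {\"$lt\": " + v + "}}";
                    default:
                        throw new IllegalArgumentException("Unsupported operator: " + c.op);
                }
            }
            if (node instanceof And) {
                List<String> parts = new ArrayList<>();
                for (Node term : ((And) node).terms) {
                    parts.add(build(term));
                }
                return "{\"$and\": [" + String.join(", ", parts) + "]}";
            }
            throw new IllegalArgumentException("Unknown node type: " + node);
        }

        // Strings are quoted, everything else is printed as-is.
        // No escaping is done here; a real implementation would need it.
        private String render(Object value) {
            return (value instanceof String) ? "\"" + value + "\"" : String.valueOf(value);
        }
    }

    public static void main(String[] args) {
        // Stand-in for what the parser would produce from: age > 30 AND city = "Oslo"
        Node tree = new And(Arrays.asList(
                new Comparison("age", ">", 30),
                new Comparison("city", "=", "Oslo")));
        System.out.println(new MongoFilterBuilder().build(tree));
        // prints {"$and": [{"age": {"$gt": 30}}, {"city": "Oslo"}]}
    }
}
```

With approach (A), the same builder could instead be driven by a tree parser walking the ANTLR-generated AST; either way, the grammar itself only needs to create nodes.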

