1
votes

I'm new to compiler design and have been watching a series of youtube videos by Ravindrababu Ravula.

I am creating my own language for fun and I'm parsing it to an Abstract Syntax Tree (AST). My understanding is that these trees can be portable given they follow the same structure as other languages.

How can I create an AST that will be portable?

Side notes:

  1. My parser is currently written in javascript but I might move it to C#.
  2. I've been looking at SpiderMonkey's specs for guidance. Is that a good approach?
1
Creating it how? Many parser generators will generate AST classes for you. Off topic.user207421
@EJP I provided a link to my code: github.com/christo8989/erl. I'm not sure what a parse generator is or how to use it. Right now my parser is in javascript but I may rewrite it in C#.christo8989
@EJP I reworded my question to be on topic.christo8989
@christo8989: there are no standards for creating ASTs. And there is no standard AST.rici
You might find this answer helpful: stackoverflow.com/a/9456860/120163Ira Baxter

1 Answers

2
votes

Portability (however defined) is not likely to be your primary goal in building an AST. Few (if any) compiler frameworks provide a clear interface which allows the use of an external AST, and particular AST structures tend to be badly-documented and subject to change without notice. (Even if they are well-documented, the complexity of a typical AST implementation is challenging.)

An AST is very tied to the syntactic details of a language, as well as to the particular parsing strategy being used. While it is useful to be able to repurpose ASTs for multiple tasks -- compiling, linting, pretty-printing, interactive editing, static analysis, etc. -- the conflicting demands of these different use cases tends to increase complexity. Particularly at the beginning stages of language development, you'll want to give yourself a lot of scope for rapid prototyping.

The most tempting reason for portable ASTs would be to use some other language as a target, thereby saving the cost of writing code-generation, etc. However, in practice it is usually easier to generate the textual representation of the other language from your own AST than to force your parser to use a foreign AST. Even better is to target a well-documented virtual machine (LLVM, .Net IL, JVM, etc.), which is often not much more work than generating, say, C code.

You might want to take a look at the LLVM Kaleidoscope tutorial (the second section covers ASTs, although implemented in C++). Also, you might find this question on a sister site interesting reading. And finally, if you are going to do your implementation in Javascript, you should at least take a look at the jison parser generator, which takes a lot of the grunt-work out of maintaining a parser and scanner (and thus allows for easier experimentation.)