7
votes

I have a problem similar to this:

How to print a tree structure into a string fast in Ocaml?

But in an opposite way, that I already have a string and want to parse it back to be a tree.

For example, I have

type expr = 
  Number of int
 |Plus of expr*expr
 |Prod of expr*expr

and I have a string like 1+2*3+4 (a little different from the link above, assume * has higher procedence than +)
Then I want my result to be a expr type Prod(Plus(1,2), Plus(3, 4))

I found another link that might talk about this, but not sure if it's a way of doing my problem:

Parsing grammars using OCaml

Please share some ideas, thank you.

2

2 Answers

4
votes

This is a standard parsing problem: faced in all compilers / interpreters / etc... There are a number of ways to attack this problem, which basically boil down to the following:

  • Write your own recursive descent parser
  • Use a parser generated from a parser generator

It sounds like what you're working on is something where you need an Abstract Syntax Tree (the "tree" you mention in the problem). You can easily use an OCaml parser generator to accomplish such things, a good one would be Menhir. While you can also write your own parser, using a tool like Menhir or ocamlyacc to do it for you is a very versatile and quick solution (compared to messing around with the yuckiness of dealing with non LL(1) things in a simple recursive descent parser).

4
votes

The OCaml distribution contains ocamlyacc a tool to do exactly what you need (other tools exist, as Kristopher pointed out).

For a nice explanation of ocamlyacc, see for instance this tutorial and the related example where a parser for a small expression language (similar to yours) is defined.