1
votes

I have a recursive function in OCaml that takes in an expression and returns an array of instructions (this is for a compilers project).

The code basically looks something like

let rec compile_body (body:expr):fn = 
  match body with 
  | Seq (expr1, expr2) -> Array.concat [compile_body expr1; compile_body expr2;]
  | Add (n1, n2) -> [|Load 0 n1; Load 1 n2; Add 0 0 1;|]
  | ... 

Would it be more performant to write the function as returning a list, and then at the very end, converting the list to an array?

2
You could also keep a tree of arrays, which would represent the logical concatenation of all arrays.coredump

2 Answers

1
votes

Array.concat or Array.append, according to the documentation, allocate a new array then copy the content of the two initial arrays in the later.

List.append complexity is proportional to the length of the first argument only.

So using list should theoretically be more efficient as your last array concatenation would anyway have the same complexity as the final Array.of_list operation.

edit : As @coredump mentionned, keeping a tree of arrays, which would represent the logical concatenation of all arrays, would be the cheaper structures, and if you nead an actual array at the end, you could then compute the overall size of your "abstract" array, create an array of this size, and then fill it with the content of the tree of array. The operation would be linear in the size of the final array.

This being said, I doubt that it will make a big difference at the end unless you have huge expression to compile. In wich case you should also pay attention to the fact that List.append is not tail recursive and thus might leads to stack overflow on (really) big lists.

1
votes

Using an array will copy the whole thing over and over. That quickly becomes too slow.

Using a list with append easily has the same problem, or even worse. Never append something to a long list. Order your code so it appends to a short list. rev_append can also be useful. But in your case a Seq (a, b) can have long lists for a and b so appending is a bad idea.

The solution is to use lists but to not use append. Do a right to left traversal of your syntax tree and build the list starting at the end. That way all operations only add to the front of the list.

Alternatively do a left to right traversal to build the list backwards and at the end reverse it. This is often easier to read as you construct things in the order one would expect.