9
votes

I am new to OCaml and reading the Real World OCaml (RWO) book. Pattern matching is described in chapter 3 and seems weak in comparison to Erlang's (or Prolog's) pattern matching.

My questions are:

  1. why is OCaml's pattern matching weaker?
  2. are there any advantages to OCaml's style of pattern matching?

A concrete example:

The following function (taken from RWO p. 63) destutters a list

let rec destutter list =
    match list with
    | [] -> []
    | [hd] -> [hd]
    | hd :: hd' :: tl ->
      if hd = hd' then ds1 (hd' :: tl)
      else hd :: ds1 (hd' :: tl)
  ;;

# destutter [1;2;3;3;4;5;5;6];;
- : int list = [1; 2; 3; 4; 5; 6]

In Erlang it would be possible (and I think preferred) to use pattern matching instead of the conditional:

destutter([])      -> [];
destutter([X])     -> [X];
destutter([H,H|T]) -> destutter([H|T]);
destutter([H|T])   -> [H | destutter(T)].

Trying this kind of thing in OCaml ...

let rec destutter list =
    match list with
    | [] -> []
    | [hd] -> [hd]
    | hd :: hd :: tl -> destutter tl  (* error *)
    | hd :: tl -> hd :: destutter tl
  ;;

... raises an error on the marked line:

Error: Variable hd is bound several times in this matching

So, Erlang/Prolog-style pattern matching doesn't work in OCaml. Why? What are the advantages of the OCaml approach?

With thanks and best wishes

Ivan

3

3 Answers

14
votes

First, you can always capture equality between pattern variables via a when clause in OCaml, e.g.:

let rec destutter = function
    | []              -> []
    | [hd]            -> [hd]
    | hd :: hd' :: tl
      when hd = hd'   -> destutter (hd :: tl)
    | hd :: hd' :: tl -> hd :: destutter (hd' :: tl)

There is a trade-off here. While Erlang is more expressive, OCaml pattern matching is simpler (which means a simpler language definition, compiler, etc.), and you still can do what you need (at the cost of writing more code).

Note that while you can rewrite a non-linear pattern as a linear pattern with a when clause, that is not the primary issue. More critically, pattern matching then needs to have a notion of equality for arbitrary types in order to support non-linear patterns. This is not an issue in Erlang, but OCaml not only already has = vs == built-in (structural equality vs. identity), but for any given type, it may not be the kind of equality that you need (think strings and case-sensitivity, for example). Then, as a consequence, checking for exhaustiveness or overlap becomes non-trivial. In the end, it's questionable whether providing a special case for one specific type of equality is worth it, given how many useful relations between parts of a pattern there can be. (I note that non-strict languages have additional issues.)

As an aside, Prolog's pattern matching is unification-based and strictly more powerful than either Erlang or OCaml (but also more difficult to implement).

10
votes

Patterns in OCaml are compiled into a very efficient code with lots of sophisticated optimizations. Bjarne Stroustrup even bragged that they managed to write something comparable in certain cases in C++. But in general OCaml pattern matching is much faster. And it's charming to look at the assembly output. And maybe Erlang provides more flexibility, but it is expected from a dynamic language. Otherwise, why use them at all.

There is another issue also. Patterns are matched structurally. If you want to match [H,H|T] you are actually invoking the comparison of the first two elements. In most cases, the comparison function should be provided by a user, and it is not known in advance.

8
votes

The Erlang pattern is more powerful because it can match against something determined at run time. The OCaml patterns match against things fixed at compile time. It follows that the OCaml patterns can probably be made to go faster. I also find OCaml-style patterns easier to reason about.