1
votes

This is an open question but I never managed to find a solution that pleases me.

Let's say I have this algebraic data type :

type t = A of int | B of float | C of string

Now, let's say I want to write a compare function - because I want to put my values in a Map, for example - I'd write it like this :

let compare t1 t2 =
  match t1, t2 with
    | A x, A y -> compare x y
    | A _, _ -> -1
    | _, A _ -> 1
    | B x, B y -> compare x y
    | B _, _ -> -1
    | _, B _ -> 1
    | C x, C y (* or _ *) -> compare x y

Or I could write it like this :

let compare t1 t2 = 
  match t1, t2 with
    | A x, A y -> compare x y
    | B y, B x -> compare x y
    | C x, C y -> compare x y
    | A _, _
    | B _, C _ -> -1
    | _ -> 1

If I'm not wrong, saying that n is the number of constructors, the first compare will have 3 * (n - 1) + 1 cases and the second one will have n + ((n - 2) * (n - 1)) / 2 + 2 cases.

This is pretty unsatisfying since :

  • n = 3 (our case) : 7 or 6 cases
  • n = 4 : 10 or 8 cases
  • n = 5 : 13 or 13 cases

It grows pretty fast.

So, I was wondering, do you do it like I do or do you use another method ?

Or, even better, is there the possibility of doing something like

let compare t1 t2 =
  match t1, t2 with
    | c1 x, c2 y -> 
      let c = compare c1 c2 in
      if c = 0 then compare x y else c

Or,

let compare (c1 x) (c2 y) = 
  let c = compare c1 c2 in
  if c = 0 then compare x y else c

Edit : added a compare if the two constructors are equal for señor Drup (from Guadalup ;-P)

3
Except your comparison function is wrong, since it will say that A 1 and A 2 are equals.Drup
Yes, but this won't be a problem, don't worry for that. "wrong" is just a point of view ;-) I edited my question to please you ;-)Lhooq

3 Answers

6
votes

You can use ppx_deriving to generate this function.

The following will create a function compare : t -> t -> int that does the right thing:

type t = A of int | B of float | C of string [@@deriving ord]

If you are curious, or cannot use ppx_deriving, here is the generated code, which uses a similar strategy as Reimer's solution.

% utop -require ppx_deriving.std -dsource
utop # type t = A of int | B of float | C of string [@@deriving ord];;
type t = | A of int | B of float | C of string [@@deriving ord]
let rec (compare : t -> t -> Ppx_deriving_runtime.int) =
  ((let open! Ppx_deriving_runtime in
      fun lhs  ->
        fun rhs  ->
          match (lhs, rhs) with
          | (A lhs0,A rhs0) -> Pervasives.compare lhs0 rhs0
          | (B lhs0,B rhs0) -> Pervasives.compare lhs0 rhs0
          | (C lhs0,C rhs0) -> Pervasives.compare lhs0 rhs0
          | _ ->
              let to_int = function
              | A _ -> 0
              | B _ -> 1
              | C _ -> 2
              in
              Pervasives.compare (to_int lhs) (to_int rhs))
  [@ocaml.warning "-A"]) ;;
type t = A of int | B of float | C of string
val compare : t -> t -> int = <fun>
4
votes

There are a couple of alternatives. First, you can use the Obj module to compare the representations directly (though that is obviously implementation-dependent):

type t = A of int | B of float | C of string

let compare_t a b =
  match (a, b) with
  | A x, A y -> compare x y
  | B x, B y -> compare x y
  | C x, C y -> compare x y
  | _ -> compare (Obj.tag (Obj.repr a)) (Obj.tag (Obj.repr b))

If you want it to be more portable, you can write (or generate) a function that gives you the numeric tag. As it turns out, the current OCaml compiler seems to be looking for that and appears to be capable of eliding the underlying function call.

let tag_of_t = function
  | A _ -> 0
  | B _ -> 1
  | C _ -> 2

let compare_t a b =
  match (a, b) with
  | A x, A y -> compare x y
  | B x, B y -> compare x y
  | C x, C y -> compare x y
  | _ -> compare (tag_of_t a) (tag_of_t b)

You still have to deal with linear growth, just not quadratic growth.

1
votes

In case you only need compare for using the Map.Make functor, i.e. you do not care about the specific order, you can use the built-in compare. In particular it works for the type given in your example:

let compare t1 t2 = compare t1 t2

If some, but not all of your cases are outside of the scope for the built-in compare (for example they are of function type), you can still treat the remaining cases in O(1) code size. Example:

type t = A of int -> int | B of float | C of string

let compare t1 t2 = match t1,t2 with
| A f1, A f2 -> ...
| A _, _ -> -1
| _, A _ -> 1
| _, _ -> compare t1 t2

If all that fails, there is still the option of doing meta-programming (for example through camlp5) to automatically create an otherwise longish compare. I have done that in the past, because otherwise the order of e.g.

type three = Zero | One | Two

is unspecified (Pervasives.compare is only specified to be some total order) and I wanted the natural order.