5
votes

EDIT: I replaced abstract example with module type A and B by a more concrete example using groups and rings.

I present my problems with functors on an example using well-known algebraic structure. Define a signature for groups:

module type Group = 
sig
    type t
    val neutral: t
    val add: t -> t -> t
    val opp: t -> t
end

And define a signature for groups that contains many useful operations, e.g. here conjugation:

module type Extended_Group =
sig
    include Group
    val conjugate: t -> t -> t
end

Implementation for conjugation depends only on addition and opposite, so I don't want to write it explicitly for all groups I define, so I write the following functor:

module F (G: Group) =
struct
    include G
    let conjugate x y = add (add x y) (opp x)
end

Now, suppose you are working with other structures that "extend" the notion of group, for instance the ring of integers:

module Z = 
struct
    (* Group *)
    type t = int
    let neutral = 0
    let add x y = x+y
    let opp x = -x
    (* Ring *)
    let m_neutral = 1
    let mult x y = x*y
end

Since a ring is a particle case of group, we can apply functor F to it:

module Extended_Z = F(Z)

Unfortunately, to apply F to Z, Ocaml first restrict the type of Z to Group, and then apply F. Everything behaves exactly as if I did this:

module (Group_Z: Group) = Z
module Extended_Z = F(Group_Z)

And thus, eventhough Z.mult makes sense, Extended_Z.mult is unknown to the compiler. I don't want to lose the fact that Extended_Z is still a ring ! Of course, the solution that consist in writing a functor F_Ring that does just like F but taking a Ring as input is not a satisfying one: I don't want to write F_Fields, F_VectorSpace, and so on.

Here is what I would want:

  1. Either a way to extend the signature of a module, to expose more values. It is possible to do it the other way, restricting the signature of a module (with the syntax I used for Group_Z), but I can't find a way to enlarge a signature. Here it would be something like:

    module (Ring_Extended_Z: Ring) = Extended_Z
    
  2. Or, a way to define polymorphic functors, that is defining a functor F that takes a module of signature "at least Group" and output a module of signature "at least Extended_Group". Of course, this makes sense only in presence of include, which is why I strongly believe such a feature does not exist in OCaml.

After searching for several days for answers on the Internet, I think the correct way to achieve this is actually the following one: (Inspired from chapter 9 of Real World OCaml)

module F (G: Group) =
struct
    let conjugate x y = ...
end

(same as previously but without the include) and then perform all the includes at the same point:

module Extended_Z =
struct
    include Z
    include F(Z)
end

Can someone confirm that this is the good way to achieve what I want, and that my first approach does not fit the spirit of OCaml ? More specifically, can someone confirm that options 1. and 2. are indeed impossible in OCaml ?

EDIT: Concerning dynamic typing

It seems to me that this is still static typing, since if I write

module Extended_Z = F(Z: Ring)

OCaml could still type this statically as a module that has signature Ring with the extra value "conjugate". It requires polymorphism at the module level, which I understand does not exist, but I think it could exist without making the type system dynamic.

2
afaict functors are always monomorphic. they have a definite type that is known at compile time. therefore e.g. a functor defined by Id (A) = struct include A end cannot be made to accept any module A, since the input type has to be known. see here: caml.inria.fr/pub/docs/manual-ocaml/modules.html#sec211 . if you look up the definition of a module type you will find that there is no equivalent of a polymorphic variant for modules, i.e. no notion of 'at least these values' for a module type.user3240588
I completely understand that modules are monomorphic, but the include command gives the impression that such a functor Id can be made. Define module type Empty = sig end, and module Id (A: Empty) = struct include A end; you will be able to apply functor Id to any module A, at the price of forgetting everything that A contains.Simon Halfon
i see. i agree that include would make it appear that you can include just anything but in fact what you include must appear as a functor argument with a definite signature (as you know). anyway, see the proper answers belowuser3240588

2 Answers

2
votes

I share in this answer the two partially-satisfying solution I have found to implement what I wanted.

The first one is the one suggested at the end of my question, and in the case of algebraic structure is probably the "good" one.

module type Group = 
sig
    type t
    val neutral: t
    val add: t -> t -> t
    val opp: t -> t
end

module type Extended_Group =
sig
    include Group
    val conjugate: t -> t -> t
end

module F (G: Group) =
struct
    let conjugate x y = G.add (G.add x y) (G.opp x)
end

module type Ring = 
sig
    include Group
    val m_neutral: t
    val mult: t -> t -> t
end

module Z = 
struct
    type t = int
    let neutral = 0
    let add x y = x+y
    let opp x = -x
    (* Ring *)
    let m_neutral = 1
    let mult x y = x*y
end

module Extended_Z = 
struct
    include Z
    include F(Z)
end

In this case, the signature of Extended_Z inferred by OCaml "extends" the signature Ring, and indeed Extended_Z.mult is visible outside the module.

The second solution uses pieces of what Andreas Rossberg and Ivg suggested to simulate polymorphism at the module level. It is less satisfying in the case of algebraic structure, but for my actual problem it is just perfect.

module type Group_or_More =
sig
    include Group
    module type T
    module Aux: T
end

module F (G: Group_or_More) =
struct
    include G
    let conjugate x y = add (add x y) (opp x)
end


module Z = 
struct
    type t = int
    let neutral = 0
    let add x y = x + y
    let opp x = -x
    module type T = 
        sig
            val m_neutral: t
            val mult: t -> t -> t
        end
    module Aux = 
        struct
            let m_neutral = 1
            let mult x y = x*y
        end
end


module Extended_Z = F(Z) ;;

Here, the signature Group_or_More captures exactly this idea that I had of a signature extending another. You put everything extra into the auxiliary module Aux. But, in order to be able to provide as many auxiliary functions as one wants, you specify the signature of this Aux module as part of your group.

For algebraic structure, it is less satisfying since Z and Extended_Z are rings in a weaker sense: multiplication is accessed by Z.Aux.mult and Extended_Z.Aux.mult.

2
votes

Functors are abstractions over structures, that produce structures from structures. In fact, they are categorical morphisms, i.e., arrows. This is a very powerful mechanism, but still has a restriction: all parameters of a module must have a fixed (constant) module type, i.e., there are no module type variables. That means, that it is impossible to get what you want and sometimes you will need to write more code, than you expect.

Your example actually captures an idea of least (minimal) structure and greatest (maximal). The former distills the minimal requirements for an algebraic structure, while the latter describes a maximal set of elements, that can be derived from it (where elements are represented as structure fields). Compare it with Haskell's type classes, where you provide a minimal implementation and the rest is derived by a type class definition. The same, can be expressed with functors, but unlike Haskell there is no machinery that will apply this functors automagically, you need to instantiate all functors manually. This is in line with OCaml ideology - explicit is better than implicit. (Although, the upcoming modular implicits will diverge from this ideology by providing similar inference mechanism as Haskell's type classes).

But enough words, let's do some coding. As I said, your example, poses a great example for minimal and maximal structures. We will start by defining minimal requirements for corresponding algebraic structures trying to follow the definitions closely:

module Min = struct
  module type Set = sig
    type t
    val compare : t -> t -> int
  end

  module type Monoid = sig
    include Set
    val zero  : t
    val (+) : t -> t -> t
  end

  module type Group = sig
    include Monoid
    val (~-) : t -> t
  end

  module type Ring = sig
    include Group
    val one : t
    val ( * ) : t -> t -> t
    val (~/) : t -> t
  end
  ...
end

We put all module type definitions into module Min, that denotes, that they are minimal. These are only module types, i.e., signatures.

Now, we can define a maximal signatures. Here a problem is that it is usually hard to stop, i.e., to find the greatest fixpoint. But this is not a big problem, as you may continue to write functors that takes smaller structures and returns bigger until you're satisfied with a result.

For example, we can infer lots of things just from having type t and the compare function:

module Max = struct
  module type Set = sig
    include Min.Set
    val (=) : t -> t -> t
    val (<) : t -> t -> t
    val (>) : t -> t -> t
    module Set : Set.S  (* this is Set from stdlib, not ours *)
    module Map : Map.S
    (* and much more *)
  end

  module Ring = sig
    include Min.Ring
    ...
  end
  ...
end

Now, when we have a nice theory of structures, we can move to the implementation. We can provide a generic morphisms Min.T -> Max.T for all signatures T, e.g.,

module Set(S : Min.Set) : Max.Set with type t = S.t = struct
  include S
  let (<) x y = compare x y = -1
  ...
  module Set = Set.Make(S)
  module Map = Map.Make(S)
  ...
end

Using these generic functors as basic instruments, and our minimal definitions we can build a structure for a real type, that we're defining. Moreover, since generic functors are often not able to provide optimal implementation for some operations, we can provide our own, that will leverage our knowledge of the implementation. For example:

module Z = struct
  module T = struct
    type t = float
    let compare = compare
    let one = 1.
    let zero = 0.
    let (+) = (+.)
    let (~-) = (~-.)
    let (~/) x = (1. /. x)
  end
  include Set(T)
  include Ring(T)
  (* override minus and division as we can implement them more efficiently *)
  let (-) = (~.)
  let (/) = (/.)
end

And we can describe a module type of our concrete Z module very easily, by just including corresponding maximal signatures plus adding our specific stuff:

module type Z = sig
  include Ring
  val read : in_channel -> t
  val write : out_channel -> t -> unit
end

About module level polymorphism

It was noted in the comments by Andreas Rossberg, that OCaml module sub-language actually allows polymorphism at least syntactically. This is a system F style polymorphism that must be introduced explicitly by L term, a term that binds types instead of values. Since modules can carry types, we can express L term with a module that has the following module type:

module type L = sig module type T end

Now, we can define classical polymorphic functions, id, fst and snd:

module Id(L:L)(X:L.T) = X
module Fst(L1:L)(L2:L)(M1 : L1.T)(M2 : L2.T) = M1
module Snd(L1:L)(L2:L)(M1 : L1.T)(M2 : L2.T) = M2

However, it looks like that we're very limited in what we can do with our parameters in the right hand side of a functor definition. For example, we cannot combine two modules:

module Sum(L1:L)(L2:L)(M1 : L1.T)(M2 : L2.T) = struct
  include M1 
  ^^^^^^^^^^
  Error: This module is not a structure; it has type L1.T
  include M2
end

We also can't open modules, as we're getting the same error. The problem is that include and open statements are evaluating a module type variable to its value, expecting to get a signature. Since a typing environment doesn't yet have a binding for the L1.T we got a, kind of misleading, error message. Given this limitation, we can only work with parameters on the identifier level. For example, we still can define the Sum module as:

module Sum(L1:L)(L2:L)(M1 : L1.T)(M2 : L2.T) = struct
  module X = M1
  module Y = M2
end

It is very questionable, whether it has any practical usage, though. Another example would be to define App functor, that will apply a functor to the argument (also of dubious usability):

module App(L1:L)(L2:L)
 (F : functor (X : L1.T) -> L2.T)(X:L1.T) = struct
  module R = F(X)
end