I am implementing Algorithm W (the Hindley-Milner type system) in JavaScript:
The function which implements the above rules is typecheck
and it has the following signature:
typecheck :: (Context, Expr) -> Monotype
It is defined as follows:
function typecheck(context, expression) {
switch (expression.type) {
case "Var":
var name = expression.name;
var type = context[name];
return inst(type);
case "App":
var fun = typecheck(context, expression.fun);
var dom = typecheck(context, expression.arg);
var cod = new Variable;
unify(fun, abs(dom, cod));
return cod;
case "Abs":
var param = expression.param;
var env = Object.create(context);
var dom = env[param] = new Variable;
var cod = typecheck(env, expression.result);
return abs(dom, cod);
case "Let":
var assignments = expression.assignments;
var env = Object.create(context);
for (var name in assignments) {
var value = assignments[name];
var type = typecheck(context, value);
env[name] = gen(context, type);
}
return typecheck(env, expression.result);
}
}
A little bit about the data types:
A context is an object which maps identifiers to polytypes.
type Context = Map String Polytype
An expression is defined by the following algebraic data type:
data Expr = Var { name :: String } | App { fun :: Expr, arg :: Expr } | Abs { param :: String, result :: Expr } | Let { assignments :: Map String Expr, result :: Expr } | Rec { assignments :: Map String Expr, result :: Expr }
In addition, we have the following functions which are required by the algorithm but are not essential to the question:
inst :: Polytype -> Monotype
abs :: (Monotype, Monotype) -> Monotype
gen :: (Context, Monotype) -> Polytype
The inst
function specializes a polytype and the gen
function generalizes a monotype.
Anyway, I want to extend my typecheck
function to allow recursive definitions as well:
Where:
However, I am stuck with a chicken and egg problem. Context number one has the assumptions v_1 : τ_1, ..., v_n : τ_n
. Furthermore, it implies e_1 : τ_1, ..., e_n : τ_n
. Hence, you first need to create the context in order to find the types of e_1, ..., e_n
but in order to create the context you need to find the types of e_1, ..., e_n
.
How do you solve this problem? I was thinking of assigning new monotype variables to the identifiers v_1, ..., v_n
and then unifying each monotype variable with its respective type. This is the method that OCaml uses for its let rec
bindings. However, this method does not yield the most general type as is demonstrated by the following OCaml code:
$ ocaml
OCaml version 4.02.1
# let rec foo x = foo (bar true)
and bar x = x;;
val foo : bool -> 'a = <fun>
val bar : bool -> bool = <fun>
However, GHC does compute the most general type:
$ ghci
GHCi, version 7.10.1: http://www.haskell.org/ghc/ :? for help
Prelude> let foo x = foo (bar True); bar x = x
Prelude> :t foo
foo :: Bool -> t
Prelude> :t bar
bar :: t -> t
As you can see, OCaml infers the type val bar : bool -> bool
while GHC infers the type bar :: t -> t
. How does Haskell infer the most general type of the function bar
?
I understand from @augustss' answer that type inference for recursive polymorphic functions is undecidable. For example, Haskell cannot infer the type of the following size
function without additional type annotations:
data Nested a = Epsilon | Cons a (Nested [a])
size Epsilon = 0
size (Cons _ xs) = 1 + size xs
If we specify the type signature size :: Nested a -> Int
then Haskell accepts the program.
However, if we only allow a subset of algebraic data types, inductive types, then the data definition Nested
becomes invalid because it is not inductive; and if I am not mistaken then the type inference of inductive polymorphic functions is indeed decidable. If so, then what is the algorithm used to infer the type of polymorphic inductive functions?