18
votes

I am currently trying to wrap my head around typeclasses and instances and I don't quite understand the point of them yet. I have two questions on the matter so far:

1) Why is it necessary to have a type class in a function signature when the function uses some function from that type class. Example:

f :: (Eq a) => a -> a -> Bool
f a b = a == b

Why put (Eq a) in the signature. If == is not defined for a then why not just throw the error when encountering a == b? What is the point in having to declare the type class ahead?

2) How are type classes and function overloading related?

It is not possible to do this:

data A = A
data B = B

f :: A -> A
f a = a

f :: B -> B
f b = b

But it is possible to do this:

data A = A
data B = B

class F a where
  f :: a -> a

instance F A where
  f a = a

instance F B where
  f b = b

What is up with that? Why can't I have two functions with the same name but operating on different types... Coming from C++ I find that very strange. But I probably have wrong conceptions about what these things really are. but once I wrap them in these type class instance thingies I can.

Feel free to hurl category or type theoretical words at me as well, as I am learning about these subjects in parallel to learning Haskell and I suspect there is a theoretical basis in these for how Haskell does things here.

4
"just throw the error" at runtime is exactly what statically typed languages try to avoid. A main point of having a type system is detecting certain kinds of errors early, at compile time.chi
No, I mean throw the error when encountering the undefined function during compilation.lo tolmencre
If I write f :: a -> a -> Bool, and later apply it to something that can't be compared for equality, say, f (1+) (1-), how should the compiler detect that we now need to have an (==) defined in order to throw an error? Answer: the compiler records a small fact about f that it applies (==) to its arguments, so that it knows when it sees calls to f to check that (==) is available. How do we notate that fact? Answer: we put it in the type, as f :: Eq a => a -> a -> Bool.Daniel Wagner
@lotolmencre Ah, maybe I'm starting to see your point. It's possible that you are confused since you are used to C++ templates. C++ templates are more-or-less compiled (including type checking) for every instantiation of the template arguments. Haskell's (bounded-)polymorphic functions are only compiled once, and there is no "template instantiation" (similar to e.g. Java's generics). In C++, you need header files to contain template sources. In Haskell, you don't need the source after compilation.chi
If I had to make a forced comparison, typeclasses are vaguely related to C++20 concepts, except that in Haskell there's no SFINAE: the function is compiled early (unlike templates), and constraints must guarantee that the function definition is OK for all types (providing the constraints).chi

4 Answers

38
votes

I agree with much of Willem Van Onsem’s answer, but I think it overlooks one of the principal advantages of typeclasses over truly ad-hoc overloading: abstraction. Imagine we used ad-hoc overloading instead of typeclasses to define the Monad operations:

-- Maybe
pure :: a -> Maybe a
pure = Just

(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
Just x >>= f = f x
Nothing >>= _ = Nothing

-- Either
pure :: a -> Either e a
pure = Right

(>>=) :: Either e a -> (a -> Either e b) -> Either e b
Right x >>= f = f x
Left err >>= _ = Left err

Now, we know that every monad can be expressed in terms of pure and >>=, as above, but we also know that they can be equivalently expressed using fmap, pure, and join. Therefore, we should be able to implement a join function that works on any monad:

join x = x >>= id

However, now we have a problem. What is join’s type?

Clearly, join has to be polymorphic, since it works on any monad by design. But giving it the type signature forall m a. m (m a) -> m a would obviously be wrong, since it doesn’t work for all types, only monadic ones. Therefore, we need something in our type that expresses the need for the existence of some operation (>>=) :: m a -> (a -> m b) -> m b, which is exactly what the typeclass constraint provides.

Given this, it becomes clear that ad-hoc overloading makes it possible to overload names, but it is impossible to abstract over those overloaded names because there is no guarantee the different implementations are related in any way. You could define monads without typeclasses, but then you couldn’t define join, when, unless, mapM, sequence, and all the other nice things that you get for free when you define just two operations.

Therefore, typeclasses are necessary in Haskell to enable code reuse and to avoid enormous amounts of duplication. But could you have both typeclass-style overloading and type-directed, ad-hoc name overloading? Yes, and in fact, Idris does. But Idris’s type inference is very different from Haskell’s, so it’s more feasible to support than it is in Haskell for many of the reasons in Willem’s answer.

18
votes

In short: because that is how Haskell was designed.

Why put (Eq a) in the signature. If == is not defined for a then why not just throw the error when encountering a == b?

Why do we put the types in the signature of a C++ program (and not just somewhere as an assertion in the body)? Because that is how C++ is designed. Typically a concept on what programming languages are built is "make explicit what needs to be explicit".

It is not said that a Haskell module is open-source. So that means we only have the signature available. It would thus mean that when we for instance write:

Prelude> foo A A

<interactive>:4:1: error:
    • No instance for (Eq A) arising from a use of ‘foo’
    • In the expression: foo A A
      In an equation for ‘it’: it = foo A A

We would frequently write foo here with types that have no Eq typeclass. As a result, we would get a lot of errors that are only discovered at compile time (or if Haskell was a dynamic language, at runtime). The idea of putting Eq a in the type signature is that we can look up the signature of foo in advance, and thus ensure that the types are instance of the typeclass.

Note that you do not have to write type signatures yourself: Haskell can typically derive the signature of a function, but a signature should include all the necessary information to call and use a function effectively. By adding type constraints, we speed up development.

What is up with that? Why can't I have two functions with the same name but operating on different types.

Again: that is how Haskell is designed. Functions in functional programming languages are "first class citizens". It means these usually have a name and we want to avoid name clashes as much as possible. Just like classes in C++ typically have a unique name (except for namespaces).

Say you would define two different functions:

incr :: Int -> Int
incr = (+1)

incr :: Bool -> Bool
incr _ = True

bar = incr

Then which incr would bar have to select? Of course we can make the types explicit (i.e. incr :: Bool -> Bool), but usually we want to avoid that work, since it introduces a lot of noise.

Another good reason why we do not do that, is because typically a typeclass is not just a collection of functions: it adds contracts to these functions. For instance the Monad typeclass has to satisfy certain relations between the functions. For example (>>= return) should be equivalent with id. In other words, the typeclass:

class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b
    return :: a -> m a

Does not describes two independent functions (>>=) and return: this is a set of functions. You have them both (usually with some contracts between the specific >>= and return), or none of these at all.

6
votes

This only answers question 1 (directly, at least).

The type signature f :: a -> a -> Bool is shorthand for f :: forall a. a -> a -> Bool. f would not truly work for all types a if it only works for as that have (==) defined. This restriction to types that have (==) is expressed using the constraint (Eq a) in f :: forall a. (Eq a) => a -> a -> Bool.

"For all"/universal quantification is at the core of Haskell's (parametric) polymorphism and, among other things, provides the powerful and important properties of parametricity.

1
votes

Haskell holds to two axioms (among others):

  1. Every variable is useable as an expression in its own right;
  2. Every expression has a type that precisely specifies what you are allowed to do with it.

If you had

f :: A -> A

and

f :: B -> B

then, according to the principles adopted in Haskell, f would still be a valid expression, on its own which would still have to have a single type. While it's possible to do that using subtyping, it was deemed much more complicated than the type-class solution.

Similarly, the need for Eq a in

(==) :: Eq a => a -> a -> Bool

comes from the fact that the type of == has to completely describe what you can do with it. If you can only call it at certain types, the type signature has to reflect that.