39
votes

I once asked a question on haskell beginners, whether to use data/newtype or a typeclass. In my particular case it turned out that no typeclass was required. Additionally Tom Ellis gave me a brilliant advice, what to do when in doubt:

The simplest way of answering this which is mostly correct is:

    use data

I know that typeclasses can make a few things a bit prettier, but not much AFIK. It also strikes me that typeclasses are mostly used for brain stem stuff, wheras in newer stuff, new typeclasses hardly ever get introduced and everything is done with data/newtype.

Now I wonder if there are cases where typeclasses are absolutely required and things could not be expressed with data/newtype?

Answering a similar question on StackOverflow Gabriel Gonzales said

Use type classes if:

    There is only one correct behavior per given type

    The type class has associated equations (i.e. "laws") that all instances must satisfy

Hmm ..

Or are typeclasses and data/newtype somewhat competing concepts which coexist for historical reasons?

6

6 Answers

44
votes

I would argue that typeclasses are an essential part of Haskell.

They are the part of Haskell that makes it the easiest language I know of to refactor, and they are a great asset to your being able to reason about the correctness of code.

So, let's talk about dictionary passing.

Now, any sort of dictionary passing is a big improvement in the state of affairs in traditional object oriented languages. We know how to do OOP with vtables in C++. However, the vtable is 'part of the object' in OOP languages. Fusing the vtable with the object forces your code into a form where you have a rigid discipline about who can extend the core types with new features, its really only the original author of the class who has to incorporate all the things others want to bake into their type. This leads to "lava flow code" and all sorts of other design antipatterns, etc.

Languages like C# give you the ability to hack in extension methods to fake new stuff, and "traits" in languages like scala and multiple inheritance in other languages let you delegate some of the work as well, but they are partial solutions.

When you split the vtable from the objects they manipulate you get a heady rush of power. You can now pass them around wherever you want, but then of course you need to name them and talk about them. The ML discipline around modules / functors and the explicit dictionary passing style take this approach.

Typeclasses take a slightly different tack. We rely on uniqueness of a typeclass instance for a given type and it is in large part it is this choice permits us to get away with such simple core data types.

Why?

Because we can move the use of the dictionaries to the use sites, and don't have to carry them around with the data types and we can rely upon the fact that when we do so nothing has changed about the behavior of the code.

Mechanical translation of the code to more complex manually passed dictionaries loses the uniqueness of such a dictionary at a given type. Passing the dictionaries in at different points in your program now leads to programs with greatly differing behavior. You may or may not have to remember the dictionaries your data type was constructed with, and woe betide you if you want to have conditional behavior based on what your arguments are.

For simple examples like Set you can get away with a manual dictionary translation. The price doesn't seem so high. You have to bake in the dictionary for, say, how you want to sort the Set when you make the object and then insert/lookup, would just preserve your choice. This might be a cost you can bear. When you union two Sets now, of course, its up in the air which ordering you get. Maybe you take the smaller and insert it into the larger, but then the ordering would change willy nilly, so instead you have to take say, the left and always insert it into the right, or document this haphazard behavior. You're now being forced into suboptimal performing solutions in the interest of 'flexibility'.

But Set is a trivial example. There you might bake an index into the type about which instance it was you are using, there is only one class involved. What happens when you want more complex behavior? One of the things we do with Haskell is work with monad transformers. Now you have lots of instances floating around -- and you don't have a good place to store them all, MonadReader, MonadWriter, MonadState, etc. may all apply.. conditionally, based on the underlying monad. what happens when you hoist and swap it out and now different things may or may not apply?

Carrying around an explicit dictionaries for this is a lot of work, there isn't a good place to store them and you are asking users to adopt a global program transformation to adopt this practice.

These are the things that typeclasses make effortless.

Do I believe you should use them for everything?

Not by a long shot.

But I can't agree with the other replies here that they are inessential to Haskell.

Haskell is the only language that supplies them and they are critical to at least my ability to think in this language, and are a huge part of why I consider Haskell home.

I do agree with a few things here, use typeclasses when there are laws and when the choice is unambiguous.

I'd challenge however, that if you don't have laws or if the choice isn't unambiguous, you may not know enough about how to model the problem domain, and should be seeking something for which you can fit it into the typeclass mold, possibly even into existing abstractions -- and when you finally find that solution, you'll find you can easily reuse it.

32
votes

Typeclasses are, in most cases, inessential. Any typeclass code can be mechanically converted into dictionary-passing style. They mainly provide convenience, sometimes an essential amount of convenience (cf. kmett's answer).

Sometimes the single-instance property of typeclasses is used to enforce invariants. For example, you could not convert Data.Set into dictionary-passing style safely, because if you inserted twice with two different Ord dictionaries, you could break the data structure invariant. Of course you could still convert any working code to working code in dictionary-passing style, but you would not be able to outlaw as much broken code.

Laws are another important cultural aspect to typeclasses. The compiler does not enforce laws, but Haskell programmers expect typeclasses to come with laws that all the instances satisfy. This can be leveraged to provide stonger guarantees about some functions. This advantage comes only from the conventions of the community, and is not a formal property of a language.

5
votes

To answer that part of the question:

"typeclasses and data/newtype somewhat competing concepts"

No. Typeclasses are an extension to the type system, that allows you to make constraints on polymorphic arguments. Like most things in programming, they are, of course, syntactic sugar [so they aren't essential in the sense that their use can't be replaced by anything else]. That doesn't mean they're superfluous. It just means you could express similar things using other language facilities, but you'd lose some clarity while you're at it. Dictionary passing can be used for mostly the same things, but it's ultimately less strict in the type system because it allows changing behavior at runtime (which is also an excellent example of where you'd use dictionary passing instead of type classes).

Data and newtype still mean exactly the same thing whether you have typeclasses or not: Introduce a new type, in the case of data as new kind of data structure, and in case of newtype as a typesafe variant of type.

3
votes

To expand slightly on my comment I would suggest always starting by using data and dictionary passing. If the boilerplate and manual instance plumbing becomes too much to bear then consider introducing a typeclass. I suspect this approach generally leads to a cleaner design.

2
votes

I just want to make a really mundane point about syntax.

People tend to underestimate the convenience afforded by type classes, probably because they have never tried Haskell without using any. This is a "the grass is greener on the other side of the fence" sort of phenomenon.

while :: Monad m -> m Bool -> m a -> m ()
while m p body = (>>=) m p $ \x ->
                 if x
                 then (>>) m body (while m p body)
                 else return m ()

average :: Floating a -> a -> a -> a -> a
average f a b c = (/) f ((+) (floatingToNum f) a ((+) (floatingToNum f) b c))
                        (fromInteger (floatingToNum f) 3)

This is the historical motivation for type classes and it remains valid today. If we didn't have type classes, we'd certainly need some kind of replacement for it to avoid writing monstrosities like these. (Maybe something like record puns or Agda's "open".)

1
votes

I know that typeclasses can make a few things a bit prettier, but not much AFIK.

Bit prettier?? No! Way prettier! (as others have already noted)

However the answer to this really depends very much where this question comes from.

  • If Haskell is your tool of choice for serious software engineering, typeclasses are powerful and essential.
  • If you are a beginner using haskell to learn (functional) programming, the complexity and difficulty of typeclasses can outweigh the advantages – certainly at the beginning of your studies.

Here are a couple of examples comparing ghc with gofer (predecessor of hugs, predecessor of modern haskell):

gofer

? 1 ++ [2,3,4]
ERROR: Type error in application
*** expression     :: 1 ++ [2,3,4]
*** term           :: 1
*** type           :: Int
*** does not match :: [Int]

Now compare with ghc:

Prelude> 1 ++ [2,3,4]
:2:1:
    No instance for (Num [a0]) arising from the literal `1'
    Possible fix: add an instance declaration for (Num [a0])
    In the first argument of `(++)', namely `1'
    In the expression: 1 ++ [2, 3, 4]
    In an equation for `it': it = 1 ++ [2, 3, 4]

:2:7:
    No instance for (Num a0) arising from the literal `2'
    The type variable `a0' is ambiguous
    Possible fix: add a type signature that fixes these type variable(s)
    Note: there are several potential instances:
      instance Num Double -- Defined in `GHC.Float'
      instance Num Float -- Defined in `GHC.Float'
      instance Integral a => Num (GHC.Real.Ratio a)
        -- Defined in `GHC.Real'
      ...plus three others
    In the expression: 2
    In the second argument of `(++)', namely `[2, 3, 4]'
    In the expression: 1 ++ [2, 3, 4] 

This should suggest that error-message-wise, not only are typeclasses not prettier, they can be uglier!

One can go all the way (in gofer) and use the 'simple prelude' that uses no typeclasses at all. This makes it quite unrealistic for serious programming but real neat for wrapping your head round Hindley-Milner:

Standard Prelude

? :t (==)
(==) :: Eq a => a -> a -> Bool
? :t (+)
(+) :: Num a => a -> a -> a

Simple Prelude

? :t (==)
(==) :: a -> a -> Bool
? :t (+)
(+) :: Int -> Int -> Int