20
votes

Now that we have injective type families, is there any remaining use case for using data families over type families?

Looking at past StackOverflow questions about data families, there is this question from a couple years ago discussing the difference between type families and data families, and this answer about use cases of data families. Both say that the injectivity of data families is their greatest strength.

Looking at the docs on data families, I see reason not to rewrite all uses of data families using injective type families.

For example, say I have a data family (I've merged some examples from the docs to try to squeeze in all the features of data families)

data family G a b
data instance G Int Bool = G11 Int | G12 Bool deriving (Eq)
newtype instance G () a = G21 a
data instance G [a] b where
   G31 :: c -> G [Int] b
   G32 :: G [a] Bool

I might as well rewrite it as

type family G a b = g | g -> a b
type instance G Int Bool = G_Int_Bool
type instance G () a = G_Unit_a a
type instance G [a] b = G_lal_b a b

data G_Int_Bool = G11 Int | G12 Bool  deriving (Eq)
newtype G_Unit_a a = G21 a
data G_lal_b a b where
   G31 :: c -> G_lal_b [Int] b
   G32 :: G_lal_b [a] Bool

It goes without saying that associated instances for data families correspond to associated instances with type families in the same way. Then is the only remaining difference that we have less things in the type-namespace?

As a followup, is there any benefit to having less things in the type-namespace? All I can think of is that this will become debugging hell for someone playing with this on ghci - the types of the constructors all seem to indicate that the constructors are all under one GADT...

3
I thought that this could be done with data families and not type families, but it doesn't even work with data families - you can declare such a data family but attempting to declare any instances causes an error.user2407038
@user2407038 I'm not sure I understand - the example obviously works with type families (as the link you've given says), and I can also get it to work with data families, but I don't see the benefit. (data family T :: a -> *; data instance T 42 = MkNat; data instance T "Don't panic!" = MkSymbol)Alec
Those are different type altogether; the example in the user guide is about the equality constraint (IsTypeLit a ~ True) in the type constructor (not value constructor). You cannot have any constraints on the types of a type family, and you can declare data family T :: forall a. (IsTypeLit a ~ 'True) => a -> * but never any instances of T.user2407038
I suspect the best way to find something data families can do and type families can't is to dig into why type families can be promoted with TypeInType but data families can't.dfeuer

3 Answers

20
votes
type family T a = r | r -> a
data family D a

An injective type family T satisfies the injectivity axiom

if T a ~ T b then a ~ b

But a data family satisfies the much stronger generativity axiom

if D a ~ g b then D ~ g and a ~ b

(If you like: Because the instances of D define new types that are different from any existing types.)

In fact D itself is a legitimate type in the type system, unlike a type family like T, which can only ever appear in a fully saturated application like T a. This means

  • D can be the argument to another type constructor, like MaybeT D. (MaybeT T is illegal.)

  • You can define instances for D, like instance Functor D. (You can't define instances for a type family Functor T, and it would be unusable anyway because instance selection for, e.g., map :: Functor f => (a -> b) -> f a -> f b relies on the fact that from the type f a you can determine both f and a; for this to work f cannot be allowed to vary over type families, even injective ones.)

5
votes

You're missing one other detail - data families create new types. Type families can only refer to other types. In particular, every instance of a data family declares new constructors. And it's nicely generic. You can create a data instance with newtype instance if you want newtype semantics. Your instance can be a record. It can have multiple constructors. It can even be a GADT if you want.

It's exactly the difference between the type and data/newtype keywords. Injective type families don't give you new types, rendering them useless in the case where you need that.

I understand where you're coming from. I had this same issue with the difference initially. Then I finally ran into a use case where they're useful, even without a type class getting involved.

I wanted to write an api for dealing with mutable cells in a few different contexts, without using classes. I knew I wanted to do it with a free monad with interpreters in IO, ST, and maybe some horrible hacks with unsafeCoerce to even go so far as shoehorning it into State. This wasn't for any practical purpose, of course - I was just exploring API designs.

So I had something like this:

data MutableEnv (s :: k) a ...

newRef :: a -> MutableEnv s (Ref s a)
readRef :: Ref s a -> MutableEnv s a
writeRef :: Ref s a -> a -> MutableEnv s ()

The definition of MutableEnv wasn't important. Just standard free/operational monad stuff with constructors matching the three functions in the api.

But I was stuck on what to define Ref as. I didn't want some sort of class, I wanted it to be a concrete type as far as the type system was concerned.

Then late one night I was out for a walk and it hit me - what I essentially want is a type whose constructors are indexed by an argument type. But it had to be open, unlike a GADT - new interpreters could be added at will. And then it hit me. That's exactly what a data family is. An open, type-indexed family of data values. I could complete the api with just the following:

data family Ref (s :: k) :: * -> *

Then, dealing with the underlying representation for a Ref was no big deal. Just create a data instance (or newtype instance, more likely) whenever an interpreter for MutableEnv is defined.

This exact example isn't really useful. But it clearly illustrates something data families can do that injective type families can't.

4
votes

The answer by Reid Barton explains the distinction between my two examples perfectly. It has reminded me of something I read in Richard Eisenberg's thesis about adding dependent types to Haskell and I thought that since the heart of this question is injectivity and generativity, it would be worth mentioning how DependentHaskell will deal with this (when it eventually gets implemented, and if the quantifiers proposed now are the ones eventually implemented).

What follows is based on pages 56 and 57 (4.3.4 Matchability) of the aforementioned thesis:

Definition (Generativity). If f and g are generative, then f a ~ g b implies f ~ g

Definition (Injectivity). If f is injective, then f a ~ f b implies a ~ b

Definition (Matchability). A function f is matchable iff it is generative and injective

In Haskell as we know it now (8.0.1) the matchable (type-level) functions consist exactly of newtype, data, and data family type constructors. In the future, under DependentHaskell, one of the new quantifiers we will get will be '-> and this will be used to denote matchable functions. In other words, there will be a way to inform the compiler a type-level function is generative (which currently can only be done by making sure that function is a type constructor).