I'm writing a typeclass to add type reflection to Haskell data types. Part of it looks like this:

type VarName a = Text


class Reflective a where
   -- | A reflective type may have a fixed name, or it may be a union 
   -- where each variant has a distinct name.
   reflectiveName :: a -> VarName a
   -- | One default value of the type for each reflective name.
   reflectiveDefaults :: [a]

The idea is that if I write

data ExampleType = Foo Int | Bar String

then in the Reflective instance, reflectiveName will return either "Foo" or "Bar" as appropriate, and reflectiveDefaults will return [Foo 0, Bar ""].

So now I can write a function to give me all the names of the variants:

reflectiveNames :: (Reflective a) => [VarName a]
reflectiveNames = map reflectiveName reflectiveDefaults

and I can call it like this:

exampleNames = reflectiveNames :: [VarName ExampleType]

When I compile this I get the following error in the type declaration for reflectiveNames:

• Could not deduce (Reflective a0)
  from the context: Reflective a
    bound by the type signature for:
               reflectiveNames :: Reflective a => [VarName a]
  The type variable ‘a0’ is ambiguous

However, if I replace the VarName type alias with a newtype:

newtype VarName a = VarName Text

then it works.

Is this a feature of the Haskell type system, or is it a bug in GHC? And if the former, why is it inventing a new type variable a0?

Well, a newtype is actually a data declaration with a single constructor that takes one parameter. As a result it is a new type, whereas type only defines a type alias. For the type alias, the a plays no role in deciding whether Reflective holds, hence the confusion. For the newtype, on the other hand, you can define instances yourself and thus "tag" the a's for which it is allowed. - Willem Van Onsem
To make it clearer: say you use reflectiveNames with Text, then how is Haskell supposed to know which a to use? Whereas if you define a VarName type, then the a has to be passed to that type (and, for a real function application, can be grounded), so that it can be checked whether the instance constraint holds. - Willem Van Onsem
@WillemVanOnsem I would have expected that if I said reflectiveNames :: [Text] then the ambiguity error would have appeared there, rather than in the declaration of reflectiveNames. - Paul Johnson
Well, behind the curtains you have basically written reflectiveNames :: Reflective a => [Text], since type does not define a new type, only a type alias. That means Haskell cannot deduce what a is from the type of the function, and this creates the ambiguity. Which reflectiveName is it supposed to use? The one for a ~ Char, or a ~ Int, or a ~ (), etc.? This might sound like a detail, but all these instances can have completely different semantics. If the compiler had the freedom to simply pick one, then reflectiveNames could mean something different depending on the system. - Willem Van Onsem
@WillemVanOnsem If you can write this up as an answer then I'll accept it. - Paul Johnson

1 Answer

Why this fails for a type...

If you write type then you do not construct a new type, you construct an alias. So you defined:

type VarName a = Text

Hence each time you write VarName SomeType, you have in effect written Text. So VarName Char ~ VarName Int ~ Text (the tilde ~ means that the two types are equal).
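A minimal sketch of what that equality means in practice. The names describeCharName, intName and mixed are hypothetical and only serve as illustration; the alias is the one from the question. The point is that the type checker accepts a "VarName Int" where a "VarName Char" is asked for, because both are just Text.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- The alias from the question: the parameter a is simply ignored.
type VarName a = Text

-- Hypothetical helper whose signature asks for the name of a Char variant.
describeCharName :: VarName Char -> Text
describeCharName name = T.append (T.pack "name: ") name

-- Hypothetical value, declared as a name belonging to Int.
intName :: VarName Int
intName = T.pack "Foo"

-- This type-checks: VarName Char, VarName Int and Text are one and the same
-- type, so the parameter of the alias offers no protection at all.
mixed :: Text
mixed = describeCharName intName
```

Loading this in GHCi and evaluating mixed should go through without any type error, which is exactly why the a in VarName a cannot help the compiler choose an instance.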

Type aliases are nevertheless useful: they typically shorten the code (the alias is frequently shorter than the type it stands for), they reduce the complexity of signatures (one does not have to remember a large hierarchy of types), and they act as a placeholder when a type is not yet fully decided (for example, time can be modelled as an Int32, an Int64, etc.), so that a large number of signatures can be changed in one place.

But the point is that each time you write a VarName Char for example, Haskell will replace this with Text. So now if we take a look at your function, you have written:

reflectiveNames :: Reflective a => [Text]
reflectiveNames = map reflectiveName reflectiveDefaults

Now there is a problem with this signature: it has a type variable a (in Reflective a), but a appears nowhere to the right of the =>. Haskell therefore cannot know which a to fill in when you call this function, and that is a real problem here, since the semantics of reflectiveName and reflectiveDefaults can be completely different for a ~ Char than for a ~ Int. This is also where a0 comes from, and why the error is reported at the declaration rather than at a use site: GHC checks every signature for ambiguity, and to do so it instantiates a with a fresh type variable a0; since nothing in [Text] ties a0 back to a, the constraint Reflective a0 cannot be solved from Reflective a. The compiler cannot just "pick" a type for a, since that would mean that two different Haskell compilers could produce functions that generate different output, and thus different programs (one of the fundamental desired properties of a programming language is unambiguity: no source text should map to two semantically different programs).
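If you want to keep the type alias, one common way to tell the compiler which a is meant is to mention a in an argument. The following is only a sketch of that alternative, not what the question asked for: it uses Data.Proxy from base, and the names defaultsFor and reflectiveNamesOf are hypothetical.

```haskell
import Data.Proxy (Proxy (..))
import Data.Text (Text)
import qualified Data.Text as T

-- Still the plain alias from the question.
type VarName a = Text

class Reflective a where
  reflectiveName :: a -> VarName a
  reflectiveDefaults :: [a]

data ExampleType = Foo Int | Bar String

instance Reflective ExampleType where
  reflectiveName (Foo _) = T.pack "Foo"
  reflectiveName (Bar _) = T.pack "Bar"
  reflectiveDefaults = [Foo 0, Bar ""]

-- Hypothetical helper: the Proxy argument carries no data, it only fixes a.
defaultsFor :: Reflective a => Proxy a -> [a]
defaultsFor _ = reflectiveDefaults

-- Here a occurs in Proxy a, so the signature is no longer ambiguous even
-- though the result type is just [Text].
reflectiveNamesOf :: Reflective a => Proxy a -> [VarName a]
reflectiveNamesOf proxy = map reflectiveName (defaultsFor proxy)

exampleNames :: [Text]
exampleNames = reflectiveNamesOf (Proxy :: Proxy ExampleType)
```

The design trade-off is that every caller has to pass the proxy explicitly, whereas the newtype lets the result type carry the same information.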

... and why it works for a newtype

Now why doesn't this happen if we use newtype? A newtype is basically the same as a data declaration with one constructor and one field, except for some small details: for example, behind the curtains the constructor is not represented at runtime; the wrapped value is stored as-is, but the type checker treats it as a different type. The newtype definition

newtype VarName a = VarName Text

is thus (conceptually) almost equivalent to:

data VarName a = VarName Text

Although the compiler erases the constructor in the compiled program, we can conceptually assume that it is there.

But the main difference is that we did not define a type alias: we defined a new type, so the function signature stays:

reflectiveNames :: Reflective a => [VarName a]
reflectiveNames = map reflectiveName reflectiveDefaults

and we cannot just write Text instead of VarName a, since a Text is not a VarName a. It also means that Haskell can perfectly well deduce what a is. If we for instance write reflectiveNames :: [VarName Char], then it knows that a is Char, and it will thus use the instance of Reflective for a ~ Char. There is no ambiguity. Of course we can define aliases like:

type Foo = VarName Char   -- a ~ Char
type Bar b = VarName Int  -- a ~ Int

But even then, a still resolves to Char and Int respectively. Because VarName is a new type, the type a is always carried through the code, and hence the code is unambiguous.
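Putting the pieces together, here is a small self-contained sketch of the newtype version applied to the ExampleType from the question. The instance body, the derived Eq and Show instances and the helper unVarName are assumptions added for illustration; the class and reflectiveNames are as in the question.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- A real new type: VarName Char and VarName Int are now distinct types.
newtype VarName a = VarName Text
  deriving (Eq, Show)

class Reflective a where
  reflectiveName :: a -> VarName a
  reflectiveDefaults :: [a]

data ExampleType = Foo Int | Bar String

instance Reflective ExampleType where
  reflectiveName (Foo _) = VarName (T.pack "Foo")
  reflectiveName (Bar _) = VarName (T.pack "Bar")
  reflectiveDefaults = [Foo 0, Bar ""]

-- The a in the result type [VarName a] determines which instance is used.
reflectiveNames :: Reflective a => [VarName a]
reflectiveNames = map reflectiveName reflectiveDefaults

-- The annotation fixes a ~ ExampleType, so there is nothing left to guess.
exampleNames :: [VarName ExampleType]
exampleNames = reflectiveNames

-- Hypothetical helper to get back at the underlying Text.
unVarName :: VarName a -> Text
unVarName (VarName t) = t
```

Evaluating map unVarName exampleNames in GHCi should yield the names of the two variants, in the order given by reflectiveDefaults.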