35
votes

I'm writing an application that uses UTF-16 strings, and to make use of the OverloadedStrings extension I tried to write an IsString instance for them:

import Data.Word ( Word16 )
import Data.String ( IsString(fromString) )

type String16 = [Word16]

instance IsString [Word16] where
    fromString = encodeUTF16

encodeUTF16 :: String -> String16

The problem is, when I try to compile the module, GHC 7.0.3 complains:

Data/String16.hs:35:10:
    Illegal instance declaration for `IsString [Word16]'
      (All instance types must be of the form (T a1 ... an)
       where a1 ... an are *distinct type variables*,
       and each type variable appears at most once in the instance head.
       Use -XFlexibleInstances if you want to disable this.)
    In the instance declaration for `IsString [Word16]'

If I comment out the instance declaration, it compiles successfully.

Why is this rejected? The instance for [Char] looks like pretty much the same thing, yet it compiles fine. Is there something I've missed?

You should consider using text, which uses UTF-16 internally (in versions before 2.0). Or at least a newtype wrapper around [Word16], to avoid problems and conflicts of this sort. - ehird
@ehird Thanks for the suggestion. I'm trying to implement Java's string hashing function, which works on 16-bit characters. Unfortunately, the text package doesn't have an easy way of working on the raw Word16 without resorting to dark magic. - Lambda Fairy
If you import Data.Text.Internal, you can access the underlying Array. - ehird
Well, case s of { Text array offs len -> A.toList array offs len } isn't too bad :) - ehird
You could also encode it into a UTF-16 ByteString, but that probably won't help you. Anyway, I'd definitely suggest a newtype around the list at the very least. - ehird

2 Answers

93
votes

After having a look through the GHC manuals and around the Haskell wiki (especially the List instance page), I've got a better idea of how this works. Here's a summary of what I've learned:

Problem

The Haskell Report defines an instance declaration like this:

The type (T u1 … uk) must take the form of a type constructor T applied to simple type variables u1, … uk; furthermore, T must not be a type synonym, and the ui must all be distinct.

The restrictions that tripped me up are "simple type variables" and "T must not be a type synonym". In plain English, they are:

  1. Anything after the type constructor must be a type variable.
  2. You can't use a type alias (using the type keyword) to get around rule 1.

So how does this relate to my problem?

[Word16] is just another way of writing [] Word16. In other words, [] is the constructor and Word16 is its argument.

So if we try to write:

instance IsString [Word16]

which is the same as

instance IsString ([] Word16) where ...

it won't work, because it violates rule 1, as the compiler kindly points out.

Trying to hide it in a type synonym with

type String16 = [Word16]
instance IsString String16 where ...

won't work either, because it violates rule 2.
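
To make the two rules concrete, here is a small sketch using a made-up class, Describe (it is not part of any library). The two live instances fit the form the Haskell Report requires; the commented-out declarations are the shapes the rules exclude.

```haskell
-- Hypothetical class, used only to illustrate instance heads.
class Describe a where
  describe :: a -> String

-- Accepted: a type constructor applied to zero arguments.
instance Describe Bool where
  describe True  = "yes"
  describe False = "no"

-- Accepted: the constructor [] applied to a single type variable.
instance Describe a => Describe [a] where
  describe = unwords . map describe

-- Not of the form the Haskell Report allows (rule 1):
-- the argument Bool is a concrete type, not a type variable.
--   instance Describe [Bool] where ...

-- Not of the form the Haskell Report allows (rule 2):
-- the head is a type synonym.
--   type Flags = [Bool]
--   instance Describe Flags where ...
```

Note that some newer GHC default language sets enable FlexibleInstances already, so the excluded shapes are only reliably rejected when compiling under the Haskell 98 or Haskell 2010 rules.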

So as it stands, it is impossible to get [Word16] (or a list of any other concrete type, for that matter) to implement IsString in standard Haskell.

Enter... (drumroll please)

Solution #1: newtype

The solution @ehird suggested is to wrap it in a newtype:

newtype String16 = String16 { unString16 :: [Word16] }
instance IsString String16 where ...

It gets around the restrictions because String16 is no longer an alias, it's a new type (excuse the pun)! The only downside to this is we then have to wrap and unwrap it manually, which is annoying.
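
Filled out, the newtype approach might look like the sketch below. The body of encodeUTF16 is an assumption (the question only gives its signature): it emits one code unit for characters below U+10000 and a surrogate pair otherwise. The OverloadedStrings pragma and the greeting binding are there only to show a literal being used at the new type.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.String (IsString (fromString))
import Data.Word (Word16)

newtype String16 = String16 { unString16 :: [Word16] }
  deriving (Eq, Show)

-- The instance head is a plain type constructor, so no extension
-- is needed for the instance declaration itself.
instance IsString String16 where
  fromString = String16 . encodeUTF16

-- Assumed implementation: one unit per BMP character,
-- a high/low surrogate pair for everything above U+FFFF.
encodeUTF16 :: String -> [Word16]
encodeUTF16 = concatMap unit
  where
    unit c
      | n < 0x10000 = [fromIntegral n]
      | otherwise   = [ fromIntegral (0xD800 + (m `shiftR` 10))
                      , fromIntegral (0xDC00 + (m .&. 0x3FF)) ]
      where
        n = ord c
        m = n - 0x10000

-- A string literal used directly at type String16.
greeting :: String16
greeting = "hi"
```

Unwrapping with unString16 is the manual step referred to above; everything that consumes the code units has to go through it.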

Solution #2: Flexible instances

At the expense of portability, we can drop the restriction altogether with flexible instances:

{-# LANGUAGE FlexibleInstances #-}

instance IsString [Word16] where ...

This was the solution @Daniel Wagner suggested.

Solution #3: Equality constraints

Finally, there's an even less portable solution using equality constraints:

{-# LANGUAGE TypeFamilies #-}

instance (a ~ Word16) => IsString [a] where ...

This works better with type inference, but is more likely to overlap. See Chris Done's article on the topic.
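
The inference difference can be seen in a small sketch. It uses a made-up class, FromStr, in place of IsString, so that it cannot clash with whatever list instances of IsString the installed base already provides. The encoder is a placeholder that is only correct for characters below U+10000.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Char (ord)
import Data.Word (Word16)

-- Hypothetical stand-in for IsString.
class FromStr a where
  fromStr :: String -> a

-- The head matches any list type; the context then forces
-- the element type to be Word16.
instance (a ~ Word16) => FromStr [a] where
  fromStr = map (fromIntegral . ord)  -- placeholder: truncates code points above 0xFFFF

-- No annotation on the intermediate list is needed: [a] selects the
-- instance, and the equality constraint pins a down to Word16.
widened :: [Int]
widened = map fromIntegral (fromStr "hi")
```

With a FlexibleInstances-style instance FromStr [Word16] instead, the same expression would most likely be reported as ambiguous, because nothing outside the instance says what the element type of the intermediate list is.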

(By the way, I ended up making a foldl' wrapper around Data.Text.Internal and writing the hash on top of that.)
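
For reference, the hash itself is small once the 16-bit code units are available. The following is a sketch over a plain list rather than the Data.Text.Internal wrapper mentioned above, and javaHash is a name chosen here for illustration. It follows the recurrence documented for Java's String.hashCode, h = 31*h + c, in wrapping 32-bit arithmetic.

```haskell
import Data.Int (Int32)
import Data.List (foldl')
import Data.Word (Word16)

-- Hypothetical helper: Java-style string hash over UTF-16 code units.
-- Int32 arithmetic wraps on overflow, matching Java's int.
javaHash :: [Word16] -> Int32
javaHash = foldl' step 0
  where
    step h c = 31 * h + fromIntegral c
```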

-12
votes

Why is this rejected?

Because:

  (All instance types must be of the form (T a1 ... an)
   where a1 ... an are *distinct type variables*,
   and each type variable appears at most once in the instance head.

Is there something I've missed?

Yes:

   Use -XFlexibleInstances if you want to disable this.)