13
votes

Dipping my toe into the waters of dependent types, I had a crack at the canonical "list with statically-typed length" example.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- a kind declaration
data Nat = Z | S Nat

data SafeList :: (Nat -> * -> *) where
    Nil :: SafeList Z a
    Cons :: a -> SafeList n a -> SafeList (S n) a

-- the type signature ensures that the input list has at least one element
safeHead :: SafeList (S n) a -> a
safeHead (Cons x xs) = x

This seems to work:

ghci> :t Cons 5 (Cons 3 Nil)
Cons 5 (Cons 3 Nil) :: Num a => SafeList ('S ('S 'Z)) a

ghci> safeHead (Cons 'x' (Cons 'c' Nil))
'x'

ghci> safeHead Nil
Couldn't match type 'Z with 'S n0
Expected type: SafeList ('S n0) a0
  Actual type: SafeList 'Z a0
In the first argument of `safeHead', namely `Nil'
In the expression: safeHead Nil
In an equation for `it': it = safeHead Nil

However, in order for this data-type to be actually useful, I should be able to build it from run-time data for which you don't know the length at compile time. My naïve attempt:

fromList :: [a] -> SafeList n a
fromList = foldr Cons Nil

This fails to compile, with the type error:

Couldn't match type 'Z with 'S n
Expected type: a -> SafeList n a -> SafeList n a
  Actual type: a -> SafeList n a -> SafeList ('S n) a
In the first argument of `foldr', namely `Cons'
In the expression: foldr Cons Nil
In an equation for `fromList': fromList = foldr Cons Nil

I understand why this is happening: the return type of Cons is different for each iteration of the fold - that's the whole point! But I can't see a way around it, probably because I've not read deeply enough into the subject. (I can't imagine all this effort is being put into a type system that is impossible to use in practice!)

So: How can I build this sort of dependently-typed data from 'normal' simply-typed data?


Following @luqui's advice I was able to make fromList compile:

data ASafeList a where
    ASafeList :: SafeList n a -> ASafeList a

fromList :: [a] -> ASafeList a
fromList = foldr f (ASafeList Nil)
    where f x (ASafeList xs) = ASafeList (Cons x xs)

Here's my attempt to unpack the ASafeList and use it:

getSafeHead :: [a] -> a
getSafeHead xs = case fromList xs of ASafeList ys -> safeHead ys

This causes another type error:

Couldn't match type `n' with 'S n0
  `n' is a rigid type variable bound by
      a pattern with constructor
        ASafeList :: forall a (n :: Nat). SafeList n a -> ASafeList a,
      in a case alternative
      at SafeList.hs:33:22
Expected type: SafeList ('S n0) a
  Actual type: SafeList n a
In the first argument of `safeHead', namely `ys'
In the expression: safeHead ys
In a case alternative: ASafeList ys -> safeHead ys

Again, intuitively it makes sense that this would fail to compile. I can call fromList with an empty list, so the compiler has no guarantee that I'll be able to call safeHead on the resulting SafeList. This lack of knowledge is roughly what the existential ASafeList captures.

Can this problem be solved? I feel like I might have walked down a logical dead-end.

3
You'd better be getting a compile error; your code is unsafe! getSafeHead [] is ill-defined. Ask yourself: where do you guarantee that the list is non-empty?luqui
@luqui What about if I had data NonEmptyList a = NEEnd a | NECons a NonEmptyList a and my getSafeHead took that as a parameter?Benjamin Hodgson

3 Answers

16
votes

Never throw anything away.

If you're going to take the trouble to crank along a list to make a length-indexed list (known in the literature as a "vector"), you may as well remember its length.

So, we have

data Nat = Z | S Nat

data Vec :: Nat -> * -> * where -- old habits die hard
  VNil :: Vec Z a
  VCons :: a -> Vec n a -> Vec (S n) a

but we can also give a run time representation to static lengths. Richard Eisenberg's "Singletons" package will do this for you, but the basic idea is to give a type of run time representations for static numbers.

data Natty :: Nat -> * where
  Zy :: Natty Z
  Sy :: Natty n -> Natty (S n)

Crucially, if we have a value of type Natty n, then we can interrogate that value to find out what n is.

Hasochists know that run time representability is often so boring that even a machine can manage it, so we hide it inside a type class

class NATTY (n :: Nat) where
  natty :: Natty n

instance NATTY Z where
  natty = Zy

instance NATTY n => NATTY (S n) where
  natty = Sy natty

Now we can give a slightly more informative existential treatment of the length you get from your lists.

data LenList :: * -> * where
  LenList :: NATTY n => Vec n a -> LenList a

lenList :: [a] -> LenList a
lenList []        = LenList VNil
lenList (x : xs)  = case lenList xs of LenList ys -> LenList (VCons x ys)

You get the same code as the length-destroying version, but you can grab a run time representation of the length anytime you like, and you don't need to crawl along the vector to get it.

Of course, if you want the length to be a Nat, it's still a pain that you instead have a Natty n for some n.

It's a mistake to clutter one's pockets.

Edit I thought I'd add a little, to address the "safe head" usage issue.

First, let me add an unpacker for LenList which gives you the number in your hand.

unLenList :: LenList a -> (forall n. Natty n -> Vec n a -> t) -> t
unLenList (LenList xs) k = k natty xs

And now suppose I define

vhead :: Vec (S n) a -> a
vhead (VCons a _) = a

enforcing the safety property. If I have a run time representation of the length of a vector, I can look at it to see if vhead applies.

headOrBust :: LenList a -> Maybe a
headOrBust lla = unLenList lla $ \ n xs -> case n of
  Zy    -> Nothing
  Sy _  -> Just (vhead xs)

So you look at one thing, and in doing so, learn about another.

5
votes

In

fromList :: [a] -> SafeList n a

n is universally quantified -- i.e. this signature is claiming that we should be able to build a SafeList of any length from the list. Instead you want to quantify existentially, which can only be done by defining a new data type:

data ASafeList a where
    ASafeList :: SafeList n a -> ASafeList a

Then your signature should be

fromList :: [a] -> ASafeList a

You can use it by pattern matching on ASafeList

useList :: ASafeList a -> ...
useList (ASafeList xs) = ...

and in the body, xs will be a SafeList n a type with an unknown (rigid) n. You will probably have to add more operations to use it in any nontrivial way.

1
votes

If you want to use dependently typed functions on runtime data, then you need to ensure, that this data doesn't violate encoded in type signatures laws. It's easier to understand this by an example. Here is our setup:

data Nat = Z | S Nat

data Natty (n :: Nat) where
    Zy :: Natty Z
    Sy :: Natty n -> Natty (S n)

data Vec :: * -> Nat -> * where
  VNil :: Vec a Z
  VCons :: a -> Vec a n -> Vec a (S n)

We can write some simple functions on Vec:

vhead :: Vec a (S n) -> a
vhead (VCons x xs) = x

vtoList :: Vec a n -> [a]
vtoList  VNil        = []
vtoList (VCons x xs) = x : vtoList xs

vlength :: Vec a n -> Natty n
vlength  VNil        = Zy
vlength (VCons x xs) = Sy (vlength xs)

For writing the canonical example of the lookup function we need the concept of finite sets. They are usually defined as

data Fin :: Nat -> where
    FZ :: Fin (S n)
    FS :: Fin n -> Fin (S n)

Fin n represents all numbers less than n.

But just like there is a type level equivalent of Nats — Nattys, there is a type level equivalent of Fins. But now we can incorporate value level and type level Fins:

data Finny :: Nat -> Nat -> * where
    FZ :: Finny (S n) Z
    FS :: Finny n m -> Finny (S n) (S m)

The first Nat is an upper bound of a Finny. And the second Nat corresponds to an actual value of a Finny. I.e. it must be equal to toNatFinny i, where

toNatFinny :: Finny n m -> Nat
toNatFinny  FZ    = Z
toNatFinny (FS i) = S (toNatFinny i)

Defining the lookup function is now straightforward:

vlookup :: Finny n m -> Vec a n -> a
vlookup  FZ    (VCons x xs) = x
vlookup (FS i) (VCons x xs) = vlookup i xs

And some tests:

print $ vlookup  FZ               (VCons 1 (VCons 2 (VCons 3 VNil))) -- 1
print $ vlookup (FS FZ)           (VCons 1 (VCons 2 (VCons 3 VNil))) -- 2
print $ vlookup (FS (FS (FS FZ))) (VCons 1 (VCons 2 (VCons 3 VNil))) -- compile-time error

That was simple, but what about the take function? It's not harder:

type Finny0 n = Finny (S n)

vtake :: Finny0 n m -> Vec a n -> Vec a m
vtake  FZ     _           = VNil
vtake (FS i) (VCons x xs) = VCons x (vtake i xs)

We need Finny0 instead of Finny, because lookup requires a Vec to be non-empty, so if there is a value of type Finny n m, then n = S n' for some n'. But vtake FZ VNil is perfectly valid, so we need to relax this restriction. So Finny0 n represents all numbers less or equal n.

But what about runtime data?

vfromList :: [a] -> (forall n. Vec a n -> b) -> b
vfromList    []  f = f VNil
vfromList (x:xs) f = vfromList xs (f . VCons x)

I.e. "give me a list and a function, that accepts a Vec of arbitrary length, and I'll apply the latter to the former". vfromList xs returns a continuation (i.e. something of type (a -> r) -> r) modulo higher-rank types. Let's try it:

vmhead :: Vec a n -> Maybe a
vmhead  VNil        = Nothing
vmhead (VCons x xs) = Just x

main = do
    print $ vfromList ([] :: [Int]) vmhead -- Nothing
    print $ vfromList  [1..5]       vmhead -- Just 1

Works. But aren't we just repeat ourself? Why vmhead, when there is vhead already? Should we rewrite all safe functions in an unsafe way to make is possible to use them on runtime data? That would be silly.

All we need is to ensure, that all invariants hold. Let's try this principle on the vtake function:

fromIntFinny :: Int -> (forall n m. Finny n m -> b) -> b
fromIntFinny 0 f = f FZ
fromIntFinny n f = fromIntFinny (n - 1) (f . FS)

main = do       
    xs <- readLn :: IO [Int]
    i <- read <$> getLine
    putStrLn $
        fromIntFinny i $ \i' ->
        vfromList xs   $ \xs' ->
        undefined -- what's here?

fromIntFinny is just like vfromList. It's instructive to see, what the types are:

i'  :: Finny n m
xs' :: Vec a p

But vtake has this type: Finny0 n m -> Vec a n -> Vec a m. So we need to coerce i', so that it would be of type Finny0 p m. And also toNatFinny i' must be equal to toNatFinny coerced_i'. But this coercion is not possible in general, since if S p < n, then there are elements in Finny n m, that are not in Finny (S p) m, since S p and n are upper bounds.

coerceFinnyBy :: Finny n m -> Natty p -> Maybe (Finny0 p m)
coerceFinnyBy  FZ     p     = Just FZ
coerceFinnyBy (FS i) (Sy p) = fmap FS $ i `coerceFinnyBy` p
coerceFinnyBy  _      _     = Nothing

That's why there is Maybe here.

main = do       
    xs <- readLn :: IO [Int]
    i <- read <$> getLine
    putStrLn $
        fromIntFinny i $ \i' ->
        vfromList xs   $ \xs' ->
        case i' `coerceFinnyBy` vlength xs' of
            Nothing  -> "What should I do with this input?"
            Just i'' -> show $ vtoList $ vtake i'' xs'

In the Nothing case a number, that was read from the input, is bigger, than the length of a list. In the Just case a number is less or equal to the length of a list and coerced to the appropriate type, so vtake i'' xs' is well-typed.

This works, but we introduced the coerceFinnyBy function, that looks rather ad hoc. Decidable "less or equal" relation would be the appropriate alternative:

data (:<=) :: Nat -> Nat -> * where
    Z_le_Z :: Z :<= m                 -- forall n, 0 <= n
    S_le_S :: n :<= m -> S n :<= S m  -- forall n m, n <= m -> S n <= S m

type n :< m = S n :<= m

(<=?) :: Natty n -> Natty m -> Either (m :< n) (n :<= m) -- forall n m, n <= m || m < n
Zy   <=? m    = Right Z_le_Z
Sy n <=? Zy   = Left (S_le_S Z_le_Z)
Sy n <=? Sy m = either (Left . S_le_S) (Right . S_le_S) $ n <=? m

And a safe injecting function:

inject0Le :: Finny0 n p -> n :<= m -> Finny0 m p
inject0Le  FZ     _          = FZ
inject0Le (FS i) (S_le_S le) = FS (inject0Le i le)

I.e. if n is an upper bound for some number and n <= m, then m is an upper bound for this number too. And another one:

injectLe0 :: Finny n p -> n :<= m -> Finny0 m p
injectLe0  FZ    (S_le_S le) = FZ
injectLe0 (FS i) (S_le_S le) = FS (injectLe0 i le)

The code now looks like this:

getUpperBound :: Finny n m -> Natty n
getUpperBound = undefined

main = do
    xs <- readLn :: IO [Int]
    i <- read <$> getLine
    putStrLn $
        fromIntFinny i $ \i'  ->
        vfromList xs   $ \xs' ->
        case getUpperBound i' <=? vlength xs' of
            Left  _  -> "What should I do with this input?"
            Right le -> show $ vtoList $ vtake (injectLe0 i' le) xs'

It compiles, but what definition should getUpperBound have? Well, you can't define it. A n in Finny n m lives only at the type level, you can't extract it or get somehow. If we can't perform "downcast", we can perform "upcast":

fromIntNatty :: Int -> (forall n. Natty n -> b) -> b
fromIntNatty 0 f = f Zy
fromIntNatty n f = fromIntNatty (n - 1) (f . Sy)

fromNattyFinny0 :: Natty n -> (forall m. Finny0 n m -> b) -> b
fromNattyFinny0  Zy    f = f FZ
fromNattyFinny0 (Sy n) f = fromNattyFinny0 n (f . FS)

For comparison:

fromIntFinny :: Int -> (forall n m. Finny n m -> b) -> b
fromIntFinny 0 f = f FZ
fromIntFinny n f = fromIntFinny (n - 1) (f . FS)

So a continuation in fromIntFinny is universally quantified over the n and m variables, while a continuation in fromNattyFinny0 is universally quantified over just m. And fromNattyFinny0 receives a Natty n instead of Int.

There is Finny0 n m instead of Finny n m, because FZ is an element of forall n m. Finny n m, while FZ is not necessarily an element of forall m. Finny n m for some n, specifically FZ is not an element of forall m. Finny 0 m (so this type is uninhabited).

After all, we can join fromIntNatty and fromNattyFinny0 together:

fromIntNattyFinny0 :: Int -> (forall n m. Natty n -> Finny0 n m -> b) -> b
fromIntNattyFinny0 n f = fromIntNatty n $ \n' -> fromNattyFinny0 n' (f n')

Achieving the same result, as in the @pigworker's answer:

unLenList :: LenList a -> (forall n. Natty n -> Vec n a -> t) -> t
unLenList (LenList xs) k = k natty xs

Some tests:

main = do
    xs <- readLn :: IO [Int]
    ns <- read <$> getLine
    forM_ ns $ \n -> putStrLn $
        fromIntNattyFinny0 n $ \n' i' ->
        vfromList xs         $ \xs'   ->
        case n' <=? vlength xs' of
            Left  _  -> "What should I do with this input?"
            Right le -> show $ vtoList $ vtake (inject0Le i' le) xs'

for

[1,2,3,4,5,6]
[0,2,5,6,7,10]

returns

[]
[1,2]
[1,2,3,4,5]
[1,2,3,4,5,6]
What should I do with this input?
What should I do with this input?

The code: http://ideone.com/3GX0hd

EDIT

Well, you can't define it. A n in Finny n m lives only at the type level, you can't extract it or get somehow.

That's not true. Having SingI n => Finny n m -> ..., we can get n as fromSing sing.