2
votes

There's plenty of Q&A about GADTs being better than DatatypeContexts, because GADTs automagically make constraints available in the right places. For example here, here, here. But sometimes it seems I still need an explicit constraint. What's going on? Example adapted from this answer:

{-# LANGUAGE  GADTs  #-}
import Data.Maybe                                              -- fromJust

data GADTBag a  where
  MkGADTBag :: Eq a => { unGADTBag :: [a] } -> GADTBag a

baz (MkGADTBag x) (Just y) = x == y

baz2           x        y  = unGADTBag x == fromJust y


-- unGADTBag :: GADTBag a -> [a]                               -- inferred, no Eq a

-- baz ::          GADTBag a -> Maybe [a] -> Bool              -- inferred, no Eq a
-- baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool              -- inferred, with Eq a

Why can't the type for unGADTBag tell us Eq a?

baz and baz2 are morally equivalent, yet have different types. Presumably because unGADTBag has no Eq a, then the constraint can't propagate into any code using unGADTBag.

But with baz2 there's an Eq a constraint hiding inside the GADTBag a. Presumably baz2's Eq a will want a duplicate of the dictionary already there(?)

Is it that potentially a GADT might have many data constructors, each with different (or no) constraints? That's not the case here, or with typical examples for constrained data structures like Bags, Sets, Ordered Lists.

The equivalent for a GADTBag datatype using DatatypeContexts infers baz's type same as baz2.

Bonus question: why can't I get an ordinary ... deriving (Eq) for GADTBag? I can get one with StandaloneDeriving, but it's blimmin obvious, why can't GHC just do it for me?

deriving instance (Eq a) => Eq (GADTBag a)

Is the problem again that there might be other data constructors?

(Code exercised at GHC 8.6.5, if that's relevant.)

Addit: in light of @chi's and @leftroundabout's answers -- neither of which I find convincing. All of these give *** Exception: Prelude.undefined:

*DTContexts> unGADTBag undefined
*DTContexts> unGADTBag $ MkGADTBag undefined
*DTContexts> unGADTBag $ MkGADTBag (undefined :: String)
*DTContexts> unGADTBag $ MkGADTBag (undefined :: [a])

*DTContexts> baz undefined (Just "hello")
*DTContexts> baz (MkGADTBag undefined) (Just "hello")
*DTContexts> baz (MkGADTBag (undefined :: String)) (Just "hello")

*DTContexts> baz2 undefined (Just "hello")
*DTContexts> baz2 (MkGADTBag undefined) (Just "hello")
*DTContexts> baz2 (MkGADTBag (undefined :: String)) (Just "hello")

Whereas these two give the same type error at compile time * Couldn't match expected type ``[Char]'* No instance for (Eq (Int -> Int)) arising from a use of ``MkGADTBag'/ ``baz2' respectively [Edit: my initial Addit gave the wrong expression and wrong error message]:

*DTContexts> baz (MkGADTBag (undefined :: [Int -> Int])) (Just [(+ 1)])
*DTContexts> baz2 (MkGADTBag (undefined :: [Int -> Int])) (Just [(+ 1)])

So baz, baz2 are morally equivalent not just in that they return the same result for the same well-defined arguments; but also in that they exhibit the same behaviour for the same ill-defined arguments. Or they differ only in where the absence of an Eq instance gets reported?

@leftroundabout Before you've actually deconstructed the x value, there's no way of knowing that the MkGADTBag constructor indeed applies.

Yes there is: field label unGADTBag is defined if and only if there's a pattern match on MkGADTBag. (It would maybe be different if there were other constructors for the type -- especially if those also had a label unGADTBag.) Again, being undefined/lazy evaluation doesn't postpone the type-inference.

To be clear, by "[not] convincing" I mean: I can see the behaviour and the inferred types I'm getting. I don't see that laziness or potential undefinedness gets in the way of type inference. How could I expose a difference between baz, baz2 that would explain why they have different types?

5
For quux x = const x (x == x) GHC infers quux :: Eq a => a -> a. unGADTBag demands MkGADTBag demands Eq a in the same way as quux, whether or not any equality test is evaluated or Eq dictionary is applied. Then why does unGADTBag lose the Eq a?AntC

5 Answers

5
votes

Function calls never bring type class constraints in scope, only (strict) pattern matching does.

The comparison

unGADTBag x == fromJust y

is essentially a function call of the form

foo (unGADTBag x) (fromJust y)

where foo requires Eq a. That would morally be provided by unGADTBag x, but that expression is not yet evaluated! Because of laziness, unGADTBag x will be evaluated only when (and if) foo demands its first argument.

So, in order to call foo in this example we need its argument to be evaluated in advance. While Haskell could work like this, it would be a rather surprising semantics, where arguments are evaluated or not depending on whether they provide a type class constraint which is needed. Imagine more general cases like

foo (if cond then unGADTBag x else unGADTBag z) (fromJust y)

What should be evaluated here? unGADTBag x? unGADTBag y? Both? cond as well? It's hard to tell.

Because of these issues, Haskell was designed so that we need to manually require the evaluation of a GADT value like x using pattern matching.

1
votes

Why can't the type for unGADTBag tell us Eq a?

Before you've actually deconstructed the x value, there's no way of knowing that the MkGADTBag constructor indeed applies. Sure, if it doesn't then you have other problems (bottom), but those might conceivably not surface. Consider

ignore :: a -> b -> b
ignore _ = id

baz2' :: GADTBag a -> Maybe [a] -> Bool
baz2' x y = ignore (unGADTBag x) (y==y)

Note that I could now invoke the function with, say, undefined :: GADTBag (Int->Int). Shouldn't be a problem since the undefined is ignored, right? Problem is, despite Int->Int not having an Eq instance, I was able to write y==y, which y :: Maybe [Int->Int] can't in fact support.

So, we can't have that only mentioning unGADTBag is enough to spew the Eq a constraint into its surrounding scope. Instead, we must clearly delimit the scope of that constraint to where we've confirmed that the MkGADTBag constructor does apply, and a pattern match accomplishes that.


If you're annoyed that my argument relies on undefined, note that the same issue arises also when there are multiple constructors which would bring different constraints into scope.


An alternative to a pattern-match that does work is this:

{-# LANGUAGE RankNTypes #-}

withGADTBag :: GADTBag a -> (Eq a => [a] -> b) -> b
withGADTBag (MkGADTBag x) f = f x

baz3 :: GADTBag a -> Maybe [a] -> Bool
baz3 x y = withGADTBag x (== fromJust y)

Response to edits

All of these give *** Exception: Prelude.undefined:

Yes of course they do, because you actually evaluate x == y in your function. So the function can only possibly yield non- if the inputs have a NF. But that's by no means the case for all functions.

Whereas these two give the same type error at compile time

Of course they do, because you're trying to wrap a value of non-Eq type in the MkGADTBag constructor, which explicitly requires that constraint (and allows you to explicitly unwrap it again!), whereas the GADTBag type doesn't require that constraint. (Which is kind of the whole point about this sort of encapsulation!)

Before you've actually deconstructed the x value, there's no way of knowing that the `MkGADTBag` constructor indeed applies.
Yes there is: field label `unGADTBag` is defined if and only if there's a pattern match on `MkGADTBag`.

Arguably, that's the way field labels should work, but they don't, in Haskell. A field label is nothing but a function from the data type to the field type, and a nontotal function at that if there are multiple constructors.
Yeah, Haskell records are one of the worst-designed features of the language. I personally tend to use field labels only for big, single-constructor, plain-old-data types (and even then I prefer using not the field labels directly but lenses derived from them).

Anyway though, I don't see how “field label is defined iff there's a pattern match” could even be implemented in a way that would allow your code to work the way you think it should. The compiler would have to insert the step of confirming that the constructor applies (and extracting its GADT-encapsulated constraint) somewhere. But where? In your example it's reasonably obvious, but in general x could inhabit a vast scope with lots of decision branches and you really don't want it to get evaluated in a branch where the constraint isn't actually needed.

Also keep in mind that when we argue with undefined/ it's not just about actually diverging computations, more typically you're worried about computations that would simply take a long time (just, Haskell doesn't actually have a notion of “taking a long time”).

1
votes

The way to think about this is OutsideIn(X) ... with local assumptions. It's not about undefinedness or lazy evaluation. A pattern match on a GADT constructor is outside, the RHS of the equation is inside. Constraints from the constructor are made available only locally -- that is only inside.

baz (MkGADTBag x) (Just y) = x == y

Has an explicit data constructor MkGADTBag outside, supplying an Eq a. The x == y raises a wanted Eq a locally/inside, which gets discharged from the pattern match. OTOH

baz2           x        y  = unGADTBag x == fromJust y

Has no explicit data constructor outside, so no context is supplied. unGADTBag has a Eq a, but that is deeper inside the l.h. argument to ==; type inference doesn't go looking deeper inside. It just doesn't. Then in the effective definition for unGADTBag

unGADTBag (MkGADTBag x) = x

there is an Eq a made available from the outside, but it cannot escape from the RHS into the type environment at a usage site for unGADTBag. It just doesn't. Sad!

The best I can see for an explanation is towards the end of the OutsideIn paper, Section 9.7 Is the emphasis on principal types well-justified? (A rhetorical question but my answer would me: of course we must emphasise principal types; type inference could get better principaled under some circumstances.) That last section considers this example

    data R a where
      RInt :: Int -> R Int
      RBool :: Bool -> R Bool
      RChar :: Char -> R Char

    flop1 (RInt x) = x

there is a third type that is arguably more desirable [for flop1], and that type is R Int -> Int.

flop1's definition is of the same form as unGADTBag, with a constrained to be Int.

    flop2 (RInt x) = x
    flop2 (RBool x) = x

Unfortunately, ordinary polymorphic types are too weak to express this restriction [that a must be only Int or Bool] and we can only get Ɐa.R a -> a for flop2, which does not rule the application of flop2 to values of type R Char.

So at that point the paper seems to give up trying to refine better principal types:

In conclusion, giving up on some natural principal types in favor of more specialized types that eliminate more pattern match errors at runtime is appealing but does not quite work unless we consider a more expressive syntax of types. Furthermore it is far from obvious how to specify these typings in a high-level declarative specification.

"is appealing". It just doesn't.

I can see a general solution is difficult/impossible. But for use-cases of constrained Bags/Lists/Sets, the specification is:

  • All data constructors have the same constraint(s) on the datatype's parameters.
  • All constructors yield the same type (... -> T a or ... -> T [a] or ... -> T Int, etc).
  • Datatypes with a single constructor satisfy that trivially.

To satisfy the first bullet, for a Set type using a binary balanced tree, there'd be a non-obvious definition for the Nil constructor:

data OrdSet a  where
  SNode :: Ord a => OrdSet a -> a -> OrdSet a -> OrdSet a
  SNil  :: Ord a => OrdSet a                     -- seemingly redundant Ord constraint

Even so, repeating the constraint on every node and every terminal seems wasteful: it's the same constraint all the way down (which is unlike GADTs for EDSL abstract syntax trees); presumably each node carries a copy of exactly the same dictionary.

The best way to ensure same constraint(s) on every constructor could just be prefixing the constraint to the datatype:

data Ord a => OrdSet a  where ...

And perhaps the constraint could go 'OutsideOut' to the environment that's accessing the tree.

0
votes

Another possible approach is to use a PatternSynonym with an explicit signature giving a Required constraint.

pattern EqGADTBag :: Eq a => [a] -> GADTBag a              -- that Eq a is the *Required*
pattern EqGADTBag{ unEqGADTBag } = MkGADTBag unEqGADTBag   -- without sig infers Eq a only as *Provided*

That is, without that explicit sig:

*> :i EqGADTBag
pattern EqGADTBag :: () => Eq a => [a] -> GADTBag a

The () => Eq a => ... shows Eq a is Provided, arising from the GADT constructor.

Now we get both inferred baz, baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool:

baz (EqGADTBag x) (Just y) = x == y

baz2           x        y  = unEqGADTBag x == fromJust y

As a curiosity: it's possible to give those equations for baz, baz2 as well as those in the O.P. using the names from the GADT decl. GHC warns of overlapping patterns [correctly]; and does infer the constrained sig for baz.

I wonder if there's a design pattern here? Don't put constraints on the data constructor -- that is, don't make it a GADT. Instead declare a 'shadow' PatternSynonym with the Required/Provided constraints.

0
votes

You can capture the constraint in a fold function, (Eq a => ..) says you can assume Eq a but only within the function next (which is defined after a pattern match). If you instantiate next as = fromJust maybe == as it uses this constraint to witness equality

--              local constraint 
--              |
--              vvvvvvvvvvvvvvvvvv
foldGADTBag :: (Eq a => [a] -> res) -> GADTBag a -> res
foldGADTBag next (MkGADTBag as) = next as

baz3 :: GADTBag a -> Maybe [a] -> Bool
baz3 gadtBag maybe = foldGADTBag (fromJust maybe ==) gadtBag

type Ty :: Type -> Type
data Ty a where
 TyInt  :: Int -> Ty Int
 TyUnit :: Ty ()

--         locally assume Int       locally assume unit
--         |                        |
--         vvvvvvvvvvvvvvvvvvv      vvvvvvvvvvvvv
foldTy :: (a ~ Int => a -> res) -> (a ~ () => res) -> (Ty a -> res)
foldTy int unit (TyInt i) = int i
foldTy int unit TyUnit    = unit

eval :: Ty a -> a
eval = foldTy id ()