(This is a follow-up to this answer, trying to get the q more precise.)
Use Case Constructors to build/access a Set
datatype. Being a set, the invariant is 'no duplicates'. To implement that I need an Eq
constraint on the element type. (More realistically, the set might be implemented as a BST or hash-index, which'll need a more restrictive constraint; using Eq
here to keep it simple.)
- I want to disallow building even an empty set with an unacceptable type.
- " it's now considered bad practice to require a constraint for an operation (data type construction or destruction) that does not need the constraint. Instead, the constraints should be moved closer to the "usage site".", to quote that answer.
- OK so there's no need to get the constraint 'built into' the data structure. And operations that access/deconstruct (like
show
ing or counting elements) won't necessarily needEq
.
Then consider two (or rather five) possible designs:
(I'm aware some of the constraints could be achieved via deriving
, esp Foldable
to get elem
. But I'll hand-code here, so I can see what minimal constraints GHC wants.)
Option 1: No constraints on datatype
data NoCSet a where -- no constraints on the datatype
NilSet_ :: NoCSet a
ConsSet_ :: a -> NoCSet a -> NoCSet a
Option 1a. use PatternSynonym as 'smart constructor'
pattern NilSet :: (Eq a) => () => NoCSet a
pattern NilSet = NilSet_
pattern ConsSet x xs <- ConsSet_ x xs where
ConsSet x xs | not (elemS x xs) = ConsSet_ x xs
elemS x NilSet_ = False
elemS x (ConsSet_ y ys) | x == y = True -- infers (Eq a) for elemS
| otherwise = elemS x ys
GHC infers the constraint
elemS :: Eq t => t -> NoCSet t -> Bool
. But it doesn't infer a constraint forConsSet
that uses it. Rather, it rejects that definition:* No instance for (Eq a) arising from a use of `elemS' Possible fix: add (Eq a) to the "required" context of the signature for pattern synonym `ConsSet'
Ok I'll do that, with an explicitly empty 'Provided' constraint:
pattern ConsSet :: (Eq a) => () => ConsType a -- Req => Prov'd => type; Prov'd is empty, so omittable
Consequently inferred type
(\(ConsSet x xs) -> x) :: Eq a => NoCSet a -> a
, so the constraint 'escapes' from the destructor (also fromelemS
), whether or not I need it at the "usage site".
Option 1b. Pattern synonym wrapping a GADT constructor as 'smart constructor'
data CSet a where CSet :: Eq a => NoCSet a -> CSet a -- from comments to the earlier q
pattern NilSetC = CSet NilSet_ -- inferred Eq a provided
pattern ConsSetC x xs <- CSet (ConsSet_ x xs) where -- also inferred Eq a provided
ConsSetC x xs | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer
pattern ConsSetC :: () => Eq a => a -> NoCSet a -> CSet a
a Provided constraint, but empty Required.Inferred
(\(ConsSetC x xs) -> x) :: CSet p -> p
, so the constraint doesn't escape from a "usage site".But there's a bug: to Cons an element, I need to unwrap the
NoCSet
inside theCSet
in the tail then re-wrap aCSet
. And trying to do that with theConsSetC
pattern alone is ill typed. Instead:insertCSet x (CSet xs) = ConsSetC x xs -- ConsSetC on rhs, to check for duplicates
As 'smart constructors' go, that's dumb. What am I doing wrong?
Inferred
insertCSet :: a -> CSet a -> CSet a
, so again the constraint doesn't escape.
Option 1c. Pattern synonym wrapping a GADT constructor as 'smarter constructor'
Same setup as option 1b, except this monster as ViewPattern for the Cons pattern
pattern ConsSetC2 x xs <- ((\(CSet (ConsSet_ x' xs')) -> (x', CSet xs')) -> (x, xs)) where ConsSetC2 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer
pattern ConsSetC2 :: a -> CSet a -> CSet a
with no constraint at all. I'm nervous. But it does correctly reject attempts to build a set with duplicates.Inferred
(\(ConsSetC2 x xs) -> x) :: CSet a -> a
, so the constraint that isn't there doesn't escape from a "usage site".Edit: ah, I can get a somewhat less monstrous
ViewPattern
expression to workpattern ConsSetC3 x xs <- (CSet (ConsSet_ x (CSet -> xs))) where ConsSetC3 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
Curiously inferred
pattern ConsSetC3 :: () => Eq a => a -> CSet a -> CSet a
-- so the Provided constraint is visible, unlike withConsSetC2
, even though they're morally equivalent. It does reject attempts to build a set with duplicates.Inferred
(\(ConsSetC3 x xs) -> x) :: CSet p -> p
, so that constraint that is there doesn't excape from "usage sites".
Option 2: GADT constraints on datatype
data GADTSet a where
GADTNilSet :: Eq a => GADTSet a
GADTConsSet :: Eq a => a -> GADTSet a -> GADTSet a
elemG x GADTNilSet = False
elemG x (GADTConsSet y ys) | x == y = True -- no (Eq a) 'escapes'
| otherwise = elemG x ys
GHC infers no visible constraint elemG :: a -> GADTSet a -> Bool
; (\(GADTConsSet x xs) -> x) :: GADTSet p -> p
.
Option 2a. use PatternSynonym as 'smart constructor' for the GADT
pattern ConsSetG x xs <- GADTConsSet x xs where
ConsSetG x xs | not (elemG x xs) = GADTConsSet x xs -- does infer Provided (Eq a) for ConsSetG
- GHC doesn't complain about the lack of signature, does infer
pattern ConsSetG :: () => Eq a => a -> GADTSet a -> GADTSet a
a Provided constraint, but empty Required. - Inferred
(\(ConsSetG x xs) -> x) :: GADTSet p -> p
, so the constraint doesn't escape from a "usage site".
Option 2b. define an insert function
insertGADTSet x xs | not (elemG x xs) = GADTConsSet x xs -- (Eq a) inferred
- GHC infers
insertGADTSet :: Eq a => a -> GADTSet a -> GADTSet a
; so theEq
has escaped, even though it doesn't escape fromelemG
.
Questions
- With
insertGADTSet
, why does the constraint escape? It's only needed for theelemG
check, butelemG
's type doesn't expose the constraint. - With constructors
GADTConsSet, GADTNilSet
, there's a constraint wrapped 'all the way down' the data structure. Does that mean the data structure has a bigger memory footprint than withConsSet_, NilSet_
? - With constructors
GADTConsSet, GADTNilSet
, it's the same typea
'all the way down'. Is the sameEq a
dictionary repeated at each node? Or shared? - By comparison, pattern
ConsSetC
/constructorCSet
/Option 1b wraps only a single dictionary(?), so it'll have a smaller memory footprint than aGADTSet
structure(?) insertCSet
has a performance hit of unwrapping and wrappingCSet
s?ConsSetC2
in the build direction seems to work; there's a performance hit in unwrapping and wrappingCSet
s? But worse there's a unwrapping/wrapping performance hit in accessing/walking the nodes?
(I'm feeling there's no slam-dunk winner amongst these options for my use case.)