4
votes

I find myself running up against the same pattern in my designs where I start with a type with a few data constructors, eventually want to be able to type against those data constructors and thus split them into their own types, just to then have to increase the verbosity of other parts of the program by needing to use Either or another tagged-union for situations where I still need to represent multiple of these types (namely collections).

I am hoping someone can point me to a better way of accomplishing what I'm trying to do. Let me start with a simple example. I am modeling a testing system, where you can have nested test suites which eventually end in tests. So, something like this:

data Node =
    Test { source::string }
    Suite { title::string, children::[Node] }

So, pretty simple so far, essentially a fancy Tree/Leaf declaration. However, I quickly realize that I want to be able to make functions that take Tests specifically. As such, I'll now split it up as so:

data Test = Test { source::string }
data Suite = Suite { title::string, children::[Either Test Suite] }

Alternatively I might roll a "custom" Either (especially if the example is more complicated and has more than 2 options), say something like:

data Node =
   fromTest Test
   fromSuite Suite

So, already its pretty unfortunate that just to be able to have a Suite that can have a combination of Suites or Tests I end up with a weird overhead Either class (whether it be with an actual Either or a custom one). If I use existential type classes, I could get away with making both Test and Suite derive "Node_" and then have Suite own a List of said Nodes. Coproducts would allow something similar, where I'd essentially do the same Either strategy without the verbosity of the tags.

Allow me to expand now with a more complex example. The results of the tests can be either Skipped (the test was disabled), Success, Failure, or Omitted (the test or suite could not be run due to a previous failure). Again, I originally started with something like this:

data Result = Success | Omitted | Failure | Skipped
data ResultTree =
    Tree { children::[ResultTree], result::Result } |
    Leaf Result

But I quickly realized I wanted to be able to write functions that took specific results, and more importantly, have the type itself enforce the ownership properties: A successful suite must only own Success or Skipped children, Failure's children can be anything, Omitted can only own Omitted, etc. So now I end up with something like this:

data Success = Success { children::[Either Success Skipped] }
data Failure = Failure { children::[AnyResult] }
data Omitted = Omitted { children::[Omitted] }
data Skipped = Skipped { children::[Skipped] }
data AnyResult =
  fromSuccess Success |
  fromFailure Failure |
  fromOmitted Omitted |
  fromSkipped Skipped

Again, I now have these weird "Wrapper" types like AnyResult, but, I get type enforcement of something that used to only be enforced from runtime operation. Is there a better strategy to this that doesn't involve turning on features like existential type classes?

2
@Jon Purdy did give a great answer below. For the curious, this singleton's tutorial goes in great depth about this design pattern.zeronone

2 Answers

3
votes

The first thing that came to my mind reading your sentence: "I quickly realized I wanted to be able to write functions that took specific results" is Refinement Types.

They allow to take only some values from a type as input, and make those constraints compile-time check/error.

There is this video from a talk at HaskellX 2018, that introduces LiquidHaskell, which allows the use of Refinement Types in Haskell:

https://skillsmatter.com/skillscasts/11068-keynote-looking-forward-to-niki-vazou-s-keynote-at-haskellx-2018

You have to decorate your haskell function signature, and have LiquidHaskell installed:

f :: Int -> i : Int {i | i < 3} -> Int would be a function which could only accept as second parameter an Int with a value < 3, checked at compile time.

You might as well put constraints on your Result type.

2
votes

I think what you may be looking for is GADTs with DataKinds. This lets you refine the types of each constructor in a data type to a particular set of possible values. For example:

data TestType = Test | Suite

data Node (t :: TestType) where
  TestNode :: { source :: String } -> Node 'Test
  SuiteNode :: { title :: String, children :: [SomeNode] } -> Node 'Suite

data SomeNode where
  SomeNode :: Node t -> SomeNode

Then when a function operates only on tests, it can take a Node 'Test; on suites, a Node 'Suite; and on either, a polymorphic Node a. When pattern-matching on a Node a, each case branch gets access to an equality constraint:

useNode :: Node a -> Foo
useNode node = case node of
  TestNode source ->          {- here it’s known that (a ~ 'Test) -}
  SuiteNode title children -> {- here, (a ~ 'Suite) -}

Indeed if you took a concrete Node 'Test, the SuiteNode branch would be disallowed by the compiler, since it can’t ever match.

SomeNode is an existential that wraps a Node of an unknown type; you can add extra class constraints to this if you want.

You can do a similar thing with Result:

data ResultType = Success | Omitted | Failure | Skipped

data Result (t :: ResultType) where
  SuccessResult
    :: [Either (Result 'Success) (Result 'Skipped)]
    -> Result 'Success
  FailureResult
    :: [SomeResult]
    -> Result 'Failure
  OmittedResult
    :: [Result 'Omitted]
    -> Result 'Omitted
  SkippedResult
    :: [Result 'Skipped]
    -> Result 'Skipped

data SomeResult where
  SomeResult :: Result t -> SomeResult

Of course I assume in your actual code there’s more information in these types; as it is, they don’t represent much. When you have a dynamic computation such as running a test that may produce different kinds of result, you can return it wrapped in SomeResult.

In order to work with dynamic results, you may need to prove to the compiler that two types are equal; for that, I direct you to Data.Type.Equality, which provides a type a :~: b which is inhabited by a single constructor Refl when the two types a and b are equal; you can pattern-match on this to inform the typechecker about type equalities, or use the various combinators to carry out more complicated proofs.

Also useful in conjunction with GADTs (and ExistentialTypes, less generally) is RankNTypes, which basically enables you to pass polymorphic functions as arguments to other functions; this is necessary if you want to consume an existential generically:

consumeResult :: SomeResult -> (forall t. Result t -> r) -> r
consumeResult (SomeResult res) k = k res

This is an example of continuation-passing style (CPS), where k is the continuation.

As a final note, these extensions are widely used and largely uncontroversial; you needn’t be wary of opting in to (most) type system extensions when they let you express what you mean more directly.