4
votes

In an attempt to improve my understanding of Haskell, I have started a personal project that allows users to combine many different predefined transforms that rely on an environment and state which are polymorphic.

The core types are organized around an environment parameterized over c, a state parameterized over the result type a, a typeclass Base that is parameterized over c and determines the type of a, and a typeclass Step that gives the interface by which the user-selectable transforms are defined in the RWS monad, and which is parameterized over both c and a:

type Plan c a = Control.Monad.RWS.RWS (Env c) Log (State a)

data Env c = Env c (Set Condition)

data State a = State a (Set Constraint)

class Base c b a | b -> a where
  execBase :: Env c -> b -> (State a, Log)

class Step c a s where
  defineStep :: s -> Plan c a ()
-- ^ Plans get folded into a single plan with >>

execPlan :: (Base c b a) => Env c -> b -> Plan c a () -> (a, Log)

The rest of the code base mostly defines a couple different data types the user can slot into the c part of Env, a few data types that can be the result a, and scads of data types that exist to hold a parameter or two and be instances of either Base or Step. Where it all falls down is that I can’t figure out how to parse any of this out of a user-supplied JSON document. I started with:

data Request c a = Request (Env c) (WrappedBase c a) [WrappedStep c a]

data WrappedBase c a where
  WrapBase :: (Base c b a, Eq b, Show b, Typeable a, Typeable b)
           => b -> WrappedBase c a

data WrappedStep c a where
  WrapStep :: (Step c a s, Eq s, Show s, Typeable s)
           => s -> WrappedStep c a

But I can’t figure out how to convince GHC to let me make a Data.Aeson.FromJSON instance for Request c a. It’s trivial to write a data type SomeC that’s a sum type over all the possible cases for c, and nearly as easy to write a parser for SomeC along with a function :: c -> Data.Aeson.Value -> Data.Aeson.Parser (Env c), but how would I turn that into a parser for Env c in such a way that I could then unify that c with the other cs in Request?

(I also tried converting the parser into a continuation-passing style, but realized once I had done it that I hadn’t fixed the problem at all.)

And the deeper mystery, how do I get GHC to perform the type function b -> a at the value level, so that I can make the a in Request the a indicated by the Base instance of b or, alternately, return a message to users letting them know the b they’ve chosen isn’t defined for the c that they specified?

It feels like what I want is to use type equality witnesses, but with type classes instead of types of kind :: *, but I searched through GHC’s thicket of extensions without finding something that would allow that.

1
Can you simplify this a bit? Surely not all details in the code you posted are relevant to the question you're asking. OTOH, adding a bit of clarification would be nice.leftaroundabout
@leftaroundabout: I trimmed the code even further, and I made the question less chatty overall. What would it be helpful for me to clarify?Jason Whittle
@JasonWhittle Thank you for reducing from the full code to just the relevant API. Ideally you'd go one step further and work out the minimal API that still exhibits the typing problem -- how few data types and classes can you make it and still have trouble writing your parser?Daniel Wagner
Thank you for taking a look, @DanielWagner; I think the big problem with this question is that it should have been two questions, but I didn’t understand the problem well enough to separate them out. However, K. A. Buhr’s magisterial answer deals with both using existential types and far better logic than I had been capable of.Jason Whittle

1 Answers

2
votes

If you were able to write a parser for Request c a, that would imply that the result of parsing the JSON is polymorphic in c and a, so a caller could take the result and use it as a Request Int Double and then as a Request String Bool, and both of those would make sense. This probably isn't what you want.

I'm going to guess here that you have a collection of environment types Env c, base types b, and step types s, each of which has a standalone FromJSON instance that can parse it, in ignorance of any other types. (So, for example, a particular base type MyBase can be parsed into a MyBase value without knowing the environment c or state a types it will be used with.)

Clearly, a specific concrete request involves exactly one environment type Env c and exactly one base type b. I'm a little fuzzy on whether the list of steps are intended to all be steps of the same type or a heterogeneous list of steps of different types, but I'll assume the latter. If so, the desired final result of your parse will be the nested existential request type:

data SomeRequest where
  SomeRequest :: (Base c b a) => Env c -> b -> [SomeStep c a] -> SomeRequest
data SomeStep c a where
  SomeStep :: (Step c a s) => s -> SomeStep c a

When you run such a request, it produces a final result (i.e., the final state a) which itself must be existential. The value will be useless to you unless you introduce some constraint. For simplicity, we'll use Show, though ToJSON might be a good choice, if you were planning to send the result back to the requester. We'll need to add this constraint to the SomeRequest type, too:

data SomeRequest where
  SomeRequest :: (Show a, Base c b a) => Env c -> b -> [SomeStep c a] -> SomeRequest
data SomeResult where
  SomeResult :: (Show a) => a -> SomeResult

To run an existential request to get an existential result, you'd use something like:

runRequest :: SomeRequest -> SomeResult
runRequest (SomeRequest e b ss) = SomeResult $ execPlan e b (mapM_ runStep ss)

runStep :: SomeStep c a -> RWS (Env c) Log (State a) ()
runStep (SomeStep s) = defineStep s

execPlan :: (Base c b a) => Env c -> b -> Plan c a () -> a
execPlan e b p = case execRWS p e (execBase e b) of (State a, _) -> a

And you might use runRequest like so:

main = do
  let r = parseRequest "<some JSON input>"
      result = runRequest r
  case result of SomeResult r -> print r

Now, we finally get down to your key problem. How do you write:

parseRequest :: String -> SomeRequest

This isn't really an Aeson problem, as far as I can see, and trying to make it one will complicate the underlying principles, so let's ignore the actual parsing, and go through the type-level programming you need to perform on the parse results.

Suppose we have the following environment, base, step, and state/result types and valid instances:

-- environments (c)
data C1 = C1 Int
data C2 = C2 String

-- bases (b)
data B1 = B1 Double
data B2 = B2 ()

-- steps (s)
data S1 = S1 Double
data S2 = S2 (Maybe Double)
data S3 = S3 ()

-- results (a)
data A1 = A1 Char deriving (Show)
data A2 = A2 Double deriving (Show)

-- valid base instances
instance Base C1 B1 A1
instance Base C2 B1 A1
instance Base C1 B2 A2

-- valid step instances
instance Step C1 A1 S1
instance Step C1 A1 S2
instance Step C2 A1 S2
instance Step C1 A2 S3

I assume here that you can parse your environments, bases, and steps into sum types. I know you have a lot of bases and steps, but I don't see any way that you can avoid enumerating them all anyway. After all, you need to give Aeson the full set of valid bases and the full set of valid steps, so you might as well use a sum type to drive the parsing and serve as centralized enumeration of bases and steps.

-- parse environment
data SomeC = C1_ C1 | C2_ C2
parseC :: String -> SomeC
parseC = undefined

-- parse base
data SomeB = B1_ B1 | B2_ B2
parseB :: String -> SomeB
parseB = undefined

-- parse list of steps
data SomeS = S1_ S1 | S2_ S2 | S3_ S3
parseSList :: String -> [SomeS]
parseSList = undefined

We need to check the validity of the environment/base combinations, but there's no way to automatically enumerate instances, so we need to make all possible combinations explicit. One way is to enumerate them in a series of case statements:

parseRequest :: String -> SomeRequest
parseRequest inp
  = case (parseC inp, parseB inp) of
      -- instance Base C1 B1 A1
      (C1_ c1, B1_ b1) -> SomeRequest (Env c1) b1 []
      -- instance Base C2 B1 A1
      (C2_ c2, B1_ b1) -> SomeRequest (Env c2) b1 []
      -- instance Base C1 B2 A2
      (C1_ c1, B2_ b2) -> SomeRequest (Env c1) b2 []
      (_, _) -> error "incompatible environment/base combination"

This works fine for an empty step list. Note that even though the SomeRequest calls need to bundle up appropriate Base c b a dictionaries, there are no error messages about ambiguous a types. That's because the functional dependency has reconciled it from the base type; this is how the type function b -> a gets run at the value level. After a case match that determines the type b, invoking SomeRequest to demand a Base c b a dictionary selects the appropriate a.

If we try to modify this for non-empty step lists, we run into a bit of a problem:

parseRequest inp
  = case (parseC inp, parseB inp, parseS inp) of
      -- instance Base C1 B1 A1
      (C1_ c1, B1_ b1, ss) -> SomeRequest (Env c1) b1 ??

Here, we have ss :: [SomeS] which can be steps of any possible step type. In order to fill in ??, we need type-level evidence of a Step c a s dictionary to pack it into a [SomeStep c a] field in the SomeRequest.

We could generate this type-level evidence using nested cases, like so:

parseRequest inp
  = case (parseC inp, parseB inp, parseSList inp) of
      -- instance Base C1 B1 A1
      (C1_ c1, B1_ b1, ss) -> SomeRequest (Env c1) b1 $
        map (\s -> case s of
                -- instance Step C1 A1 S1
                S1_ s1 -> SomeStep s1
                -- instance Step C1 A1 S2
                S2_ s2 -> SomeStep s2)
        ss
      ...

but this will generate a lot of cases.

We still can't avoid making the enumeration of Step cases explicit, but it would be much better to consider only the relatively small number of combinations of types c and a and enumerate valid steps for each of those in one place.

The easiest way of doing this is just to define a step-checking functions for each C/A combination using our existing SomeStep existential:

someStepC1A1 :: SomeS -> SomeStep C1 A1
-- instance Step C1 A1 S1
someStepC1A1 (S1_ s) = SomeStep s
-- instance Step C1 A1 S2
someStepC1A1 (S2_ s) = SomeStep s
someStepC1A1 _ = error "bad step for C1/A1 combination"

someStepC2A1 :: SomeS -> SomeStep C2 A1
-- instance Step C2 A1 S2
someStepC2A1 (S2_ s) = SomeStep s
someStepC2A1 _ = error "bad step for C2/A1 combination"

someStepC1A2 :: SomeS -> SomeStep C1 A2
-- instance Step C1 A2 S3
someStepC1A2 (S3_ s) = SomeStep s
someStepC1A2 _ = error "bad step for C1/A2 combination"

and write parseRequest to use the appropriate functions:

parseRequest :: String -> SomeRequest
parseRequest inp
  = case (parseC inp, parseB inp, parseSList inp) of
      -- instance Base C1 B1 A1
      (C1_ c1, B1_ b1, ss) -> SomeRequest (Env c1) b1 (someStepC1A1 <$> ss)
      -- instance Base C2 B1 A1
      (C2_ c2, B1_ b1, ss) -> SomeRequest (Env c2) b1 (someStepC2A1 <$> ss)
      -- instance Base C1 B2 A2
      (C1_ c1, B2_ b2, ss) -> SomeRequest (Env c1) b2 (someStepC1A2 <$> ss)
      (_, _, _) -> error "incompatible environment/base combination"

However, we can reduce some duplication by making someStep a typeclass method:

class ToSomeStep c a where
  someStep :: SomeS -> SomeStep c a

instance ToSomeStep C1 A1 where
  -- instance Step C1 A1 S1
  someStep (S1_ s) = SomeStep s
  -- instance Step C1 A1 S2
  someStep (S2_ s) = SomeStep s
  someStep _ = error "bad step for C1/A1 combination"

instance ToSomeStep C2 A1 where
  -- instance Step C2 A1 S2
  someStep (S2_ s) = SomeStep s
  someStep _ = error "bad step for C2/A1 combination"

instance ToSomeStep C1 A2 where
  -- instance Step C1 A2 S3
  someStep (S3_ s) = SomeStep s
  someStep _ = error "bad step for C1/A2 combination"

and using it in parseRequest like so:

parseRequest :: String -> SomeRequest
parseRequest inp
  = case (parseC inp, parseB inp) of
      -- instance Base C1 B1 A1
      (C1_ c1, B1_ b1) -> makeRequest c1 b1
      -- instance Base C2 B1 A1
      (C2_ c2, B1_ b1) -> makeRequest c2 b1
      -- instance Base C1 B2 A2
      (C1_ c1, B2_ b2) -> makeRequest c1 b2
      (_, _) -> error "incompatible environment/base combination"
  where makeRequest :: (Show a, Base c b a, ToSomeStep c a) => c -> b -> SomeRequest
        makeRequest c b = SomeRequest (Env c) b (someStep <$> parseSList inp)

As I say, I can't see any way to avoid enumerating all valid c/b combinations in parseRequest or enumerating all c/a/s combinations in someStep. In parseRequest, while you can't make the mistake of trying to handle an invalid c/b combination without a type error, it's still possible to miss a valid combination. Similarly for ToSomeStep, the type checker will prevent you from trying to allow an invalid step for a particular c/a combination, but it can't help you if you miss valid steps.

Anyway, the following complete program type-checks:

{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE GADTs #-}

import Control.Monad.RWS

type Log = ()
type Plan c a = RWS (Env c) Log (State a)
newtype Env c = Env c deriving (Show)
newtype State a = State a
class Base c b a | b -> a where
  execBase :: Env c -> b -> State a
class Step c a s where
  defineStep :: s -> Plan c a ()

data SomeRequest where
  SomeRequest :: (Show a, Base c b a) => Env c -> b -> [SomeStep c a] -> SomeRequest
data SomeStep c a where
  SomeStep :: (Step c a s) => s -> SomeStep c a
data SomeResult where
  SomeResult :: (Show a) => a -> SomeResult

runRequest :: SomeRequest -> SomeResult
runRequest (SomeRequest e b ss) = SomeResult $ execPlan e b (mapM_ runStep ss)

runStep :: SomeStep c a -> RWS (Env c) Log (State a) ()
runStep (SomeStep s) = defineStep s

execPlan :: (Base c b a) => Env c -> b -> Plan c a () -> a
execPlan e b p = case execRWS p e (execBase e b) of (State a, _) -> a

-- environments (c)
data C1 = C1 Int
data C2 = C2 String

-- bases (b)
data B1 = B1 Double
data B2 = B2 ()

-- steps (s)
data S1 = S1 Double
data S2 = S2 (Maybe Double)
data S3 = S3 ()

-- results (a)
data A1 = A1 Char deriving (Show)
data A2 = A2 Double deriving (Show)

-- valid base instances
instance Base C1 B1 A1
instance Base C2 B1 A1
instance Base C1 B2 A2

-- valid step instances
instance Step C1 A1 S1
instance Step C1 A1 S2
instance Step C2 A1 S2
instance Step C1 A2 S3

-- parse environment
data SomeC = C1_ C1 | C2_ C2
parseC :: String -> SomeC
parseC = undefined

-- parse base
data SomeB = B1_ B1 | B2_ B2
parseB :: String -> SomeB
parseB = undefined

-- parse list of steps
data SomeS = S1_ S1 | S2_ S2 | S3_ S3
parseSList :: String -> [SomeS]
parseSList = undefined

class ToSomeStep c a where
  someStep :: SomeS -> SomeStep c a

instance ToSomeStep C1 A1 where
  -- instance Step C1 A1 S1
  someStep (S1_ s) = SomeStep s
  -- instance Step C1 A1 S2
  someStep (S2_ s) = SomeStep s
  someStep _ = error "bad step for C1/A1 combination"

instance ToSomeStep C2 A1 where
  -- instance Step C2 A1 S2
  someStep (S2_ s) = SomeStep s
  someStep _ = error "bad step for C2/A1 combination"

instance ToSomeStep C1 A2 where
  -- instance Step C1 A2 S3
  someStep (S3_ s) = SomeStep s
  someStep _ = error "bad step for C1/A2 combination"

parseRequest :: String -> SomeRequest
parseRequest inp
  = case (parseC inp, parseB inp) of
      -- instance Base C1 B1 A1
      (C1_ c1, B1_ b1) -> makeRequest c1 b1
      -- instance Base C2 B1 A1
      (C2_ c2, B1_ b1) -> makeRequest c2 b1
      -- instance Base C1 B2 A2
      (C1_ c1, B2_ b2) -> makeRequest c1 b2
      (_, _) -> error "incompatible environment/base combination"
  where makeRequest :: (Show a, Base c b a, ToSomeStep c a) => c -> b -> SomeRequest
        makeRequest c b = SomeRequest (Env c) b (someStep <$> parseSList inp)

main = do
  let r = parseRequest "<some JSON input>"
      result = runRequest r
  case result of SomeResult r -> print r