Here's my plan: while parsing, we'll track which keys we've looked at. Parsers that don't consume all the keys of the object will fail. Here's your preamble, fleshed out to be complete and compilable:
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Control.Monad.Writer
import Data.Aeson
import Data.Aeson.Types (Parser)
import Data.Functor.Compose
import Data.HashMap.Lazy (HashMap)
import Data.Text (Text)
import qualified Data.HashMap.Lazy as HM
data VersionedThing = V1 Thing1 | V2 Thing2 deriving (Eq, Ord, Read, Show)
data Thing1 = Thing1
{ name :: Maybe String
, val1 :: Maybe String
, val2 :: Maybe String
} deriving (Eq, Ord, Read, Show)
data Thing2 = Thing2
{ name :: Maybe String
, val3 :: Maybe String
, val4 :: Maybe String
} deriving (Eq, Ord, Read, Show)
Now we'll add a type for parsing and tracking at the same time, together with embeddings for "just parse without tracking" and "just track without parsing".
type ParseAndTrack = Compose Parser (Writer (HashMap Text ()))
parse :: Parser a -> ParseAndTrack a
track :: Text -> ParseAndTrack ()
parse p = Compose (pure <$> p)
track t = Compose . pure . tell $ HM.singleton t ()
We can use these two primitives to lift (.:)
and (.:?)
to tracked versions of themselves. We'll use the suffix &
for things that parse and track.
(.:&) :: FromJSON a => Object -> Text -> ParseAndTrack a
o .:& t = track t *> parse (o .: t)
(.:?&) :: FromJSON a => Object -> Text -> ParseAndTrack (Maybe a)
o .:?& t = (Just <$> (o .:& t)) <|> pure Nothing
Finally, we'll give a top-level way to drop back down from "parse-and-track" mode to "parse-only" mode, failing if we haven't consumed all the keys available.
consumeAllOf :: Object -> ParseAndTrack a -> Parser a
consumeAllOf o p = do
(result, keys) <- runWriter <$> getCompose p
let unusedKeys = HM.difference o keys
unless (null unusedKeys) . fail $
"unrecognized keys " ++ show (HM.keys unusedKeys)
return result
Now we can write your two parsers with the above additional tools, and things should pretty much just work. The only difference in the parsers for Thing1
and Thing2
is that we throw a consumeAllOf
on front and use the tracking versions of .:
and .:?
in the middle.
instance FromJSON Thing1 where
parseJSON (Object v) = consumeAllOf v $ Thing1 <$> v.:& "name" <*> v.:?& "val1" <*> v .:?& "val2"
instance FromJSON Thing2 where
parseJSON (Object v) = consumeAllOf v $ Thing2 <$> v.:& "name" <*> v.:?& "val3" <*> v .:?& "val4"
instance FromJSON (VersionedThing) where
parseJSON v = (V1 <$> parseJSON v)
`mplus` (V2 <$> parseJSON v)
Try it out in ghci:
> decode "{\"name\": \"foo\", \"val1\": \"bar\"}" :: Maybe VersionedThing
Just (V1 (Thing1 {name = Just "foo", val1 = Just "bar", val2 = Nothing}))
> decode "{\"name\": \"foo\", \"val3\": \"bar\"}" :: Maybe VersionedThing
Just (V2 (Thing2 {name = Just "foo", val3 = Just "bar", val4 = Nothing}))
Thing1
orThing2
. I can pick an arbitrary default case for only having a "name". The real issue is this could grow to much more than two versions of data and I'd like something more extensible than just looking for fields that shouldn't be in some cases. So "most correct" would be whichever datatype decodes with the least number ofNothing
's. – jkeuhlenparseJSON
instances forThing1
orThing2
. My general goal is to allow multiple versions of the data to coexist so I'd prefer to not change any previous definitions. I was kind of thinking I could decode them, then re-encode intoObject
's and look at the length of theHashMap
to see the number of keys that succeeded. But that felt pretty hacky too. – jkeuhlen