5
votes

I have no experience in Haskell. I'm trying to parse many .json files into a data structure in Haskell using aeson. However, for reasons beyond my control, I need to store the name of the file the data was parsed from as one of the fields in my data. A simple example of what I have so far is:

data Observation = Observation { id :: Integer
                               , value :: Integer
                               , filename :: String}

instance FromJSON Observation where
  parseJSON (Object v) =
    Observation <$> (read <$> v .: "id")
                <*> v .: "value"
                <*> ????

My question is: what is a smart way to build my data when parsing a JSON file while having access to the name of the file?

What comes to mind is to define another data type like NotNamedObservation, initialize it, and then have a function that converts NotNamedObservation -> String -> Observation (where String is the filename), but that sounds like a very poor approach.

Thanks.

2
I think your idea of defining an extra datatype representing an "observation without a filename" is a very good one! - Benjamin Hodgson
It may not seem terribly clever, but I also think your suggested solution is the best one. Particularly if it had the signature NotNamedObservation -> FilePath -> Observation, then it's obvious what's going on. - Jordan Mackie
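
The conversion discussed in the question and these comments could be sketched roughly as follows. This is only an illustration: the helper names (named, parseIdent, readObservation) and the field names are made up here, and it assumes, as the question's use of read suggests, that "id" arrives as a JSON string.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.Aeson.Types (Parser)
import qualified Data.ByteString.Lazy as LBS
import Text.Read (readMaybe)

-- The JSON payload on its own, before the filename is known.
data NotNamedObservation = NotNamedObservation
  { rawIdent :: Integer
  , rawValue :: Integer
  } deriving (Show, Eq)

data Observation = Observation
  { ident    :: Integer
  , value    :: Integer
  , filename :: FilePath
  } deriving (Show, Eq)

-- Hypothetical helper: turn the string-encoded id into an Integer,
-- failing the parse (rather than crashing) if it is not numeric.
parseIdent :: String -> Parser Integer
parseIdent s = maybe (fail ("id is not an integer: " ++ s)) pure (readMaybe s)

instance FromJSON NotNamedObservation where
  parseJSON = withObject "NotNamedObservation" $ \v ->
    NotNamedObservation
      <$> (v .: "id" >>= parseIdent)
      <*> v .: "value"

-- Attach the filename once it is known.
named :: NotNamedObservation -> FilePath -> Observation
named (NotNamedObservation i x) path = Observation i x path

-- Read one file and tag the decoded result with that file's path.
readObservation :: FilePath -> IO (Either String Observation)
readObservation path = do
  bytes <- LBS.readFile path
  pure (fmap (`named` path) (eitherDecode bytes))
```

Mapping readObservation over a list of paths (for example with traverse) would then give one Either per file, so a malformed file shows up as a Left instead of aborting the whole run.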

2 Answers

2
votes

When you don't control the data definition and you have strict requirements about the format to parse, it's better to write the (de)serializer explicitly.

If external information is required to fully construct values, avoid the FromJSON/ToJSON type classes, and just write standalone parsers.
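
As a rough sketch of what such a standalone parser might look like, the filename can simply be an ordinary function argument. The names observationParser and decodeObservation are invented for this example, the record uses ident instead of id to avoid clashing with the Prelude, and "id" is assumed to be a JSON string as in the question.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.Aeson.Types (Parser, parseEither)
import qualified Data.ByteString.Lazy as LBS
import Text.Read (readMaybe)

data Observation = Observation
  { ident    :: Integer
  , value    :: Integer
  , filename :: FilePath
  } deriving (Show, Eq)

-- A plain function rather than a FromJSON instance:
-- the filename is passed in like any other argument.
observationParser :: FilePath -> Value -> Parser Observation
observationParser path = withObject "Observation" $ \v -> do
  rawId <- v .: "id"
  -- Fail the parse instead of crashing when the id is not numeric.
  i <- maybe (fail ("id is not an integer: " ++ rawId)) pure (readMaybe rawId)
  x <- v .: "value"
  pure (Observation i x path)

-- Decode the bytes to a generic Value first, then run the parser on it.
decodeObservation :: FilePath -> LBS.ByteString -> Either String Observation
decodeObservation path bytes =
  eitherDecode bytes >>= parseEither (observationParser path)
```

Because no type class is involved, nothing stops you from having several such parsers for the same type, each taking whatever extra context it needs.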

aeson's deriving mechanism is more suited to applications that talk to themselves (and thus only care about round-tripping between parseJSON and toJSON), or where there is flexibility either in defining the JSON format or the Haskell types.


If you still have to use these classes for some reason, one option is of course to just put undefined in those missing fields. To rely on the type system more, you can also parameterize types by a "phase" (that assumes again you can tweak the data type), which is a type constructor that wraps some fields.


data Observation' p = Observation
  { id :: Integer
  , value :: Integer
  , filename :: p String }

-- This is isomorphic to the original Observation data type
type Observation = Observation' Identity

-- When we don't have the filename available, we keep the field empty with Proxy
-- (an instance at a concrete type like this needs FlexibleInstances)
instance FromJSON (Observation' Proxy) where
  ...

mkObservation :: FilePath -> Observation' Proxy -> Observation
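
One way the elided pieces might be filled in is sketched below. This is an assumption about how the instance and mkObservation could look, not necessarily what was intended above; the field is renamed to ident to avoid the Prelude's id, and "id" is assumed to be a JSON string as in the question.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.Aeson.Types (Parser)
import Data.Functor.Identity (Identity (..))
import Data.Proxy (Proxy (..))
import Text.Read (readMaybe)

-- The "phase" p decides what the filename field holds:
-- Proxy holds nothing, Identity holds exactly one FilePath.
data Observation' p = Observation
  { ident    :: Integer
  , value    :: Integer
  , filename :: p FilePath
  }

type Observation = Observation' Identity

-- Hypothetical helper: parse the string-encoded id without crashing.
parseIdent :: String -> Parser Integer
parseIdent s = maybe (fail ("id is not an integer: " ++ s)) pure (readMaybe s)

-- Parsing can only produce the Proxy phase: no filename is available yet.
instance FromJSON (Observation' Proxy) where
  parseJSON = withObject "Observation" $ \v ->
    Observation
      <$> (v .: "id" >>= parseIdent)
      <*> v .: "value"
      <*> pure Proxy

-- Move from the Proxy phase to the Identity phase by supplying the filename.
mkObservation :: FilePath -> Observation' Proxy -> Observation
mkObservation path (Observation i x Proxy) = Observation i x (Identity path)
```

With this shape, code that requires a filename takes an Observation, and the type checker rejects a freshly parsed Observation' Proxy until mkObservation has been applied.
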
2
votes

Just make your instance a function from file path to Observation:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import qualified Data.ByteString.Lazy as LBS
import System.Environment

data Observation = Observation { ident    :: Integer
                               , value    :: Integer
                               , filename :: FilePath
                               } deriving (Show)

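instance FromJSON (FilePath -> Observation) where
  -- withObject reports a parse error for non-object JSON
  -- instead of hitting an incomplete pattern match
  parseJSON = withObject "Observation" $ \v ->
    do i <- read <$> v .: "id"
       l <- v .: "value"
       pure $ Observation i l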

main :: IO ()
main = do
  files <- getArgs
  fileContents <- traverse LBS.readFile files
  print fileContents
  let fs = map (maybe (error "Invalid json") id . decode) fileContents
      jsons :: [Observation]
      jsons = zipWith ($) fs files
  print jsons