Typeclass constraints, polymorphism and cassava-conduit

Question

While playing with Haskell and conduit, I came across a behavior that I have a hard time explaining. First let me list all the modules and language extensions that need to be loaded to reproduce my problem:

{-# LANGUAGE FlexibleContexts  #-}

import Conduit                         -- conduit-combinators
import Data.Csv                        -- cassava
import Data.Csv.Conduit                -- cassava-conduit
import qualified Data.ByteString as BS -- bytestring
import Data.Text (Text)                -- text
import Control.Monad.Except            -- mtl
import Data.Foldable

First I created the most general CSV parsing conduit:

pipeline :: (MonadError CsvParseError m, FromRecord a)
         => ConduitM BS.ByteString a m ()
pipeline = fromCsv defaultDecodeOptions NoHeader

Then, I wanted to output the number of elements in each row of my csv file - I know this is kind of silly and useless and that there are a billion other ways of doing this kind of things, but that was just a toy test.

So I opened GHCi and tried this:

ghci> :t pipeline .| mapC length

As expected, this did not work because the constraint FromRecord a doesn't guarantee that a is Foldable. So I defined the following conduit:

pipeline2 :: (MonadError CsvParseError m, FromField a)
          => ConduitM BS.ByteString [a] m ()
pipeline2 = fromCsv defaultDecodeOptions NoHeader

This is a legal definition because FromField a => FromField [a] is an instance of FromRecord according to the cassava documentation.

At this point, I am happy and hopeful because [] is an instance of Foldable. So, once again, I open GHCi, and I try:

ghci> :t pipeline2 .| mapC length

But I get:

<interactive>:1:1: error:
    • Could not deduce (FromField a0) arising from a use of ‘pipeline2’
      from the context: MonadError CsvParseError m
        bound by the inferred type of
                 it :: MonadError CsvParseError m => ConduitM BS.ByteString Int m ()
        at <interactive>:1:1
      The type variable ‘a0’ is ambiguous
      These potential instances exist:
        instance FromField a => FromField (Either Field a)
          -- Defined in ‘cassava-0.4.5.0:Data.Csv.Conversion’
        instance FromField BS.ByteString
          -- Defined in ‘cassava-0.4.5.0:Data.Csv.Conversion’
        instance FromField Integer
          -- Defined in ‘cassava-0.4.5.0:Data.Csv.Conversion’
        ...plus 9 others
        ...plus 11 instances involving out-of-scope types
        (use -fprint-potential-instances to see them all)
    • In the first argument of ‘(.|)’, namely ‘pipeline2’
      In the expression: pipeline2 .| mapC length

So my understanding is that my pipeline2 is not enough specified.

But now if I try to forge a trivial conduit with an (almost) identical type:

pipeline3 :: (MonadError CsvParseError m, FromField a)
          => ConduitM a [a] m ()
pipeline3 = awaitForever $ \x -> yield [x]

Again I open up GHCi and try:

ghci> :t pipeline3 .| mapC length

This time I get:

pipeline3 .| mapC length
  :: (FromField a, MonadError CsvParseError m) => ConduitM a Int m ()

So this time, GHCi understands that I don't have to specify even further the definition of pipeline3.

So my question: why is there a problem with pipeline2? is there a way to define the most generic "pipeline" without further specifying the type of the output of the conduit? I thought that a list of FromField objects would be enough.

It feels like I am missing an important point about typeclasses and how to compose functions, or here Conduit objects, in a polymorphic manner.

Thank you very much for your answers!

Ben Ben · Accepted Answer · 2016-11-23T20:39:04

pipeline3 is a conduit typed like ConduitM a [a] m () (ignoring the constraints for now). So when you map length over it you get ConduitM a Int m (); the a is still there in the first type parameter, and so the FromField a constraint can remain, waiting to be instantiated at usage sites.

pipeline2 is a conduit typed like ConduitM BS.ByteString [a] m (). Now if you map length over it you would get ConduitM BS.ByteString Int m (). There's no a anywhere that type, so the FromField a instance can't be chosen at usage sites. Instead it has to be chosen immediately. But nothing in pipeline2 .| mapC length says what a should be. That's why it's complaining that a is ambiguous.

As far as I can tell (not intimately familiar with conduits), that should be the only problem with your first definition as well. FromRecord doesn't guarantee Foldable, but it has instances that are Foldable; you just need to pin down the type being used because length won't do it. You could use an expression signature on pipeline when you use it, the TypeApplication extension, a less polymorphic definition (which doesn't need to be a reimplement like pipeline2; you could have pipeline' = pipeline if you had the right signature on pipeline').

Typeclass constraints, polymorphism and cassava-conduit

2 Answers