
Given the following code:

import           Data.Attoparsec.Text
import qualified Conduit as C
import qualified Data.Conduit.Combinators as CC

f :: FilePath -> FilePath -> IO ()
f infile outfile =
  runResourceT $
    CC.sourceFile infile $$ C.encodeUtf8C =$= x

where x's type is ConduitM Text Void (ResourceT IO) ()

The following compile-time error occurs in my private github repo:

• No instance for (mono-traversable-1.0.2:Data.Sequences.Utf8
                     ByteString Text)
    arising from a use of ‘C.encodeUtf8C’
• In the first argument of ‘(=$=)’, namely ‘C.encodeUtf8C’
  In the second argument of ‘($$)’, namely ‘C.encodeUtf8C =$= x’
  In the second argument of ‘($)’, namely
    ‘CC.sourceFile infile $$ C.encodeUtf8C =$= x’

How can I resolve this compile-time error?

EDIT

My understanding of the types:

> :t sourceFile
sourceFile
  :: MonadResource m =>
     FilePath
     -> ConduitM
          i bytestring-0.10.8.1:Data.ByteString.Internal.ByteString m ()

> :t ($$)
($$) :: Monad m => Source m a -> Sink a m b -> m b

> :t Conduit
type Conduit i (m :: * -> *) o = ConduitM i o m ()

> :i Source
type Source (m :: * -> *) o = ConduitM () o m ()

> :i Sink
type Sink i = ConduitM i Data.Void.Void :: (* -> *) -> * -> *

> :t (=$=)
(=$=)
  :: Monad m => Conduit a m b -> ConduitM b c m r -> ConduitM a c m r

C.encodeUtf8C =$= x boils down to, I think:

(mono-traversable-1.0.2:Data.Sequences.Utf8 text binary,
      Monad m) =>
     Conduit text m binary () 

=$= 

ConduitM Text Void binary () 

yielding a return type of

ConduitM text Void (ResourceT IO) ()

And I suppose that this type, i.e. C.encodeUtf8C =$= x, does not unify to the expected second argument of CC.sourceFile?

There's an instance Utf8 Text ByteString. Is something flipped around here? Do you actually need decode instead? I'm not super familiar with conduit, but what direction does your last line flow (right to left or left to right)? - Bartek Banachewicz
@BartekBanachewicz - hi. I updated my question with my understanding of the types. Thanks for your help! - Kevin Meredith
Please post the output of ghc-pkg list bytestring and ghc-pkg list mono-traversable. Also, what version of ghc are you using? Newer one should produce better error message in such cases. - Yuras
@Yuras - $ghc-pkg list bytestring /usr/local/Cellar/ghc/8.0.1_4/lib/ghc-8.0.1.20161213/package.conf.d bytestring-0.10.8.1 $ghc-pkg list mono-traversable /usr/local/Cellar/ghc/8.0.1_4/lib/ghc-8.0.1.20161213/package.conf.d. How do I tell which GHC I'm using on this stack project? - Kevin Meredith

1 Answer


The sourceFile conduit produces ByteString chunks, which you need to decode into Text before x can consume them. Encoding is the opposite direction: you serialize Text to ByteString so that it can be written to a file.

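Use decodeUtf8 instead of encodeUtf8 (with the Conduit module imported as C, as in the question, that is C.decodeUtf8C in place of C.encodeUtf8C).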
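
As a minimal sketch of that fix, the pipeline from the question can be written with the decoder in the middle. The question does not show x, so the x below is a hypothetical stand-in that simply collects the decoded Text, and readDecoded is an illustrative name rather than the asker's f (it returns the Text so the result can be inspected).

```haskell
import qualified Conduit as C
import           Conduit (($$), (=$=))
import           Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical stand-in for the question's `x`: a sink that consumes Text.
-- It collects the decoded chunks and returns them as one Text value.
x :: Monad m => C.ConduitM Text o m Text
x = T.concat <$> C.sinkList

-- Same pipeline shape as the question's `f`, with the decoder in place
-- of the encoder: ByteString chunks from the file are decoded to Text.
readDecoded :: FilePath -> IO Text
readDecoded infile =
  C.runResourceT $
    C.sourceFile infile $$ C.decodeUtf8C =$= x
```

This keeps the ($$) and (=$=) operators used in the question. Newer conduit releases mark them as deprecated in favour of runConduit and (.|), so expect deprecation warnings there.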


Why the types don't match

-- Ignoring the `Monad` constraint.

(=$=)      :: Conduit a m b -> ConduitM b c m r -> ConduitM a c m r

encodeUtf8 :: Utf8 text binary
           => Conduit text m binary

x          :: ConduitM Text Void m ()

To apply (=$=) to encodeUtf8, you must unify Conduit a m b and Conduit text m binary, so we get the following type equalities:

a ~ text
b ~ binary

Then we apply the result to x, unifying ConduitM b c m r and ConduitM Text Void m ():

b ~ Text
c ~ Void

Up to here the compiler doesn't complain, but we can already see a mismatch because of the two equalities involving b:

b ~ binary
b ~ Text

In the conduit-combinators library, the type variable binary is used to refer to types that represent raw binary data, typically ByteString, as opposed to more structured data like Text.

If we continue, the result has type ConduitM a c m r, and that is being passed as the second argument of ($$).

-- Expanding Source and Sink definitions, renaming type variables.

($$) :: Monad m => ConduitM () d m () -> ConduitM d Void m e -> m e

sourceFile infile
     :: _ => ConduitM i ByteString m ()

Using sourceFile infile as the first argument, we unify ConduitM () d m () with ConduitM i ByteString m ().

i ~ ()
d ~ ByteString

And with our previous encodeUtf8C =$= x as the second argument of ($$), we unify ConduitM d Void m e with ConduitM a c m r.

a ~ d
c ~ Void
r ~ e

Focusing on a and d, we have the following:

a ~ text
a ~ d
d ~ ByteString

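Therefore text ~ ByteString and, from the earlier equalities on b, binary ~ Text. Now remember that encodeUtf8 requires a Utf8 text binary constraint, which here becomes Utf8 ByteString Text. That is the wrong way around: the instance that exists is Utf8 Text ByteString, hence the "No instance" error.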
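
To see both directions side by side, here is a small sketch that decodes on the way in and encodes on the way out. The function name copyUpper and the T.toUpper step are illustrative assumptions, not part of the question. The point is only where each conversion sits: encodeUtf8C type-checks here because its input is Text and its output is ByteString, matching the Utf8 Text ByteString instance.

```haskell
import qualified Conduit as C
import           Conduit (($$), (=$=))
import qualified Data.Text as T

-- Hypothetical example: read a UTF-8 file, upper-case it, write it back out.
copyUpper :: FilePath -> FilePath -> IO ()
copyUpper infile outfile =
  C.runResourceT $
    C.sourceFile infile          -- produces ByteString chunks
      $$  C.decodeUtf8C          -- ByteString -> Text
      =$= C.mapC T.toUpper       -- work on Text
      =$= C.encodeUtf8C          -- Text -> ByteString (Utf8 Text ByteString)
      =$= C.sinkFile outfile     -- consumes ByteString chunks
```

Swapping the two conversions reproduces the error from the question, since encodeUtf8C would then be asked to turn ByteString into Text.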