How to improve the performance of Haskell IO?

Question

It seems that Haskell's IO is relatively slow.

For example, compare Haskell with Python:

# io.py
import sys
s=sys.stdin.read()
sys.stdout.write(s)

-- io.hs
main = do
    s <- getContents
    putStr s

Their timings (gen.py writes 512k of data to stdout):

The Python version:

$ time python gen.py | python io.py > /dev/null

real    0m0.203s
user    0m0.015s
sys     0m0.000s

The Haskell version:

$ time python gen.py | runhaskell io.hs > /dev/null

real    0m0.562s
user    0m0.015s
sys     0m0.000s

It seems that the Haskell version is far slower. Is there a problem with my test, or is this an inherent problem with Haskell?

Thanks.

Both times include the time it takes to compile the program. Try timing gen.pyc (pre-compiled) vs. a precompiled binary from io.hs. — chepner

András Kovács · Accepted Answer · 2015-06-13T09:38:32

Your example is slow because it uses lazy IO with Strings. Lazy IO and String each carry their own overhead.

In particular, String is a linked list of Chars, so every character costs two words of list overhead (one word for the constructor tag and one for the forward pointer), plus at least one word for the character itself (one word for cached low characters, three words for uncached characters).
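
To put rough numbers on that, here is a back-of-the-envelope sketch for the 512k input from the question. It assumes a 64-bit GHC with 8-byte words and that the whole String is alive at once, which lazy IO does not necessarily require; wordBytes and stringBytes are names made up for this illustration.

```haskell
-- Assumption: 64-bit GHC, so one machine word is 8 bytes.
wordBytes :: Int
wordBytes = 8

-- Heap bytes for a String of n characters, using the per-character
-- word counts given above: 2 words of list overhead plus 1 word for a
-- cached Char (3 total) or 3 words for an uncached Char (5 total).
stringBytes :: Bool -> Int -> Int
stringBytes cached n = n * wordsPerChar * wordBytes
  where
    wordsPerChar = if cached then 3 else 5

main :: IO ()
main = do
  let n = 512 * 1024
  putStrLn ("cached Chars:   " ++ show (stringBytes True n) ++ " bytes")   -- 12582912
  putStrLn ("uncached Chars: " ++ show (stringBytes False n) ++ " bytes")  -- 20971520
```

Under those assumptions, input that occupies 512 KiB as a byte array costs roughly 12 to 20 MiB worth of heap cells as a String.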

Strict IO with byte-array or Unicode-array input is much faster. Try benchmarking the following:

import qualified Data.ByteString as B

main = B.putStr =<< B.getContents

Or the following:

import qualified Data.Text as T
import qualified Data.Text.IO as T

main = T.putStr =<< T.getContents
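
If the input might be too large to hold in memory at once, a lazy ByteString is a possible middle ground: it still avoids String, but reads and writes the data in chunks. This is only a minimal sketch and is not one of the two variants suggested above, so it is worth benchmarking alongside them rather than assuming it is faster.

```haskell
import qualified Data.ByteString.Lazy as BL

-- Copy stdin to stdout chunk by chunk. `id` is a placeholder for
-- whatever transformation a real program would apply to the input.
main :: IO ()
main = BL.interact id
```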
