I am trying to read a large vector of a custom data type from a binary file. I tried to use the example given here.
The trouble with the example code is that it uses lists, and I want to use vectors. So I adapted the code as below, but it takes a very long time (more than a minute; I gave up after that) to read even a 1 MB file.

    module Main where

    import Data.Word
    import qualified Data.ByteString.Lazy as BIN
    import Data.Binary.Get
    import qualified Data.Vector.Unboxed as Vec

    main = do
      b <- BIN.readFile "dat.bin" -- about 1 MB size file
      let v = runGet getPairs (BIN.tail b) -- skip the first byte
      putStrLn $ show $ Vec.length v

    getPair :: Get (Word8, Word8)
    getPair = do
      price <- getWord8
      qty <- getWord8
      return (price, qty)

    getPairs :: Get (Vec.Vector (Word8, Word8))
    getPairs = do
      empty <- isEmpty
      if empty
        then return Vec.empty
        else do pair <- getPair
                pairs <- getPairs
                return (Vec.cons pair pairs) -- is it slow because V.cons is O(n)?

When I compiled it with ghc --make -O2 pairs.hs and ran it, I got the error Stack space overflow: current size ...
How can I efficiently read pairs of values from a bytestring into a vector?
Again, I would like complete working code, not just pointers to the Haskell site or RWH, nor just a list of function/module names.
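
One possible sketch, staying with Data.Binary.Get: since every record is exactly two bytes, the number of pairs can be computed from the payload length up front, and the vector can be built in one pass with Vec.replicateM instead of repeated Vec.cons. The names getPairsN and decodePairs are hypothetical helpers introduced here, and the sample bytes stand in for the contents of dat.bin. This has not been benchmarked against the original file; it only assumes the layout described in the question (one header byte, then price/qty byte pairs).

```haskell
module Main where

import Data.Binary.Get (Get, getWord8, runGet)
import qualified Data.ByteString.Lazy as BIN
import qualified Data.Vector.Unboxed as Vec
import Data.Word (Word8)

-- Same record parser as in the question.
getPair :: Get (Word8, Word8)
getPair = do
  price <- getWord8
  qty   <- getWord8
  return (price, qty)

-- Hypothetical helper: read exactly n pairs.
-- The vector is built once rather than by n calls to Vec.cons.
getPairsN :: Int -> Get (Vec.Vector (Word8, Word8))
getPairsN n = Vec.replicateM n getPair

-- Hypothetical helper: derive the pair count from the payload length.
-- A trailing odd byte, if any, is ignored.
decodePairs :: BIN.ByteString -> Vec.Vector (Word8, Word8)
decodePairs payload = runGet (getPairsN n) payload
  where
    n = fromIntegral (BIN.length payload `div` 2)

main :: IO ()
main = do
  -- Stand-in for: b <- BIN.readFile "dat.bin"
  let b = BIN.pack [0xFF, 10, 1, 20, 2, 30, 3]
      v = decodePairs (BIN.drop 1 b) -- skip the first byte
  print (Vec.length v)   -- prints 3
  print (Vec.toList v)   -- prints [(10,1),(20,2),(30,3)]
```

To run it against the real file, replace the sample bytes with b <- BIN.readFile "dat.bin". BIN.drop 1 is used instead of BIN.tail so that an empty file does not raise an error.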
V.cons is O(n). What did you expect? A million times a million is a lot! vector has perfectly good documentation. Note that the documentation for a Hackage module generally has a table of contents in the upper right. Check out the section on constructing vectors. – dfeuer
Data.Sequence, but Ed Kmett's been working on some faster ones lately. – dfeuer
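
To illustrate the pointer to the vector construction functions: a minimal sketch that skips the Get monad and builds the vector with Vec.generate, which fills a vector of known length from an index function. It assumes the file is small enough to load as a strict ByteString; pairsFromBytes is a hypothetical name, and the sample bytes stand in for the contents of dat.bin.

```haskell
module Main where

import qualified Data.ByteString as BS
import qualified Data.Vector.Unboxed as Vec
import Data.Word (Word8)

-- Hypothetical helper: decode consecutive (price, qty) byte pairs.
-- Each element is computed once from its index, so the work is linear
-- in the input size. A trailing odd byte, if any, is ignored.
pairsFromBytes :: BS.ByteString -> Vec.Vector (Word8, Word8)
pairsFromBytes bs = Vec.generate n pairAt
  where
    n = BS.length bs `div` 2
    pairAt i = (BS.index bs (2 * i), BS.index bs (2 * i + 1))

main :: IO ()
main = do
  -- Stand-in for: b <- BS.readFile "dat.bin"
  let b = BS.pack [0xFF, 7, 70, 8, 80]
      v = pairsFromBytes (BS.drop 1 b) -- skip the first byte
  print (Vec.length v)   -- prints 2
  print (Vec.toList v)   -- prints [(7,70),(8,80)]
```

By contrast, building the same vector with n calls to Vec.cons copies the vector built so far on every call, which is where the quadratic cost comes from.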