I'm trying to parse a binary format (PES) using Haskell:

import qualified Data.ByteString.Lazy as BL
import Data.Word
import Data.Word.Word24
import qualified Data.ByteString.Lazy.Char8 as L8

data Stitch = MyCoord Int Int deriving (Eq, Show)

data PESFile = PESFile {
      pecstart :: Word24
    , width :: Int
    , height :: Int
    , numColors :: Int
    , header :: String
    , stitches :: [Stitch]
    } deriving (Eq, Show)


readPES :: BL.ByteString -> Maybe PESFile
readPES bs =
    let s = L8.drop 7 bs
        pecstart = L8.readInt s
    in case pecstart of
        Nothing -> Nothing
        Just (offset, rest) -> Just (PESFile offset 1 1 1 "#PES" [])

main = do
  input <- BL.getContents
  print $ readPES input

I need to read pecstart to get the offset of the other data (width, height, and stitches). But this isn't working for me because I need to read a 24-bit value, and the ByteString package doesn't seem to have a 24-bit version.

Should I be using a different approach? The Data.Binary package seems good for simple formats, but I'm not sure how it would work for something like this, since you have to read a value to find the offset of the other data in the file. Something I'm missing?

Wouldn't you just add an instance of Binary for PESFile? The binary package looks like it would be fine, because the put/get functions are a sequence of actions (e.g. you can read pecstart to get to the next bit). - Jeff Foster
I'd love to try that approach Jeff. I've been working from the Real World Haskell chapter on binary input. If there's a tutorial on creating new Binary instances, I'd love to give it a shot. - nont
You should probably keep the header as a bytestring, for efficiency reasons. And MyCoord should use strict Int fields (e.g. !Int). - Don Stewart
If you'll be doing a lot of work with 24-bit ints, I'd recommend you look into the word24 package, hackage.haskell.org/package/word24. It provides 24-bit signed and unsigned integers with proper bounds, bit shifts, etc. There's a storable instance also, but for just reading one value from a bytestring I'd probably use Don Stewart's solution. - John L
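
As a rough illustration of the sequencing Jeff Foster describes, here is a minimal sketch using Data.Binary.Get from the binary package. The names getWord24le, getOffsetAndDims and parseOffsetAndDims are hypothetical, and the layout is an assumption made only for illustration (a 7-byte prefix as in the question's drop 7, a little-endian 24-bit offset, then one byte each standing in for width and height); it is not the real PES layout.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (Get, bytesRead, getWord8, runGetOrFail, skip)
import Data.Bits (shiftL, (.|.))

-- Read three bytes as an unsigned 24-bit value, least significant byte first.
-- (Assumption: little-endian; swap the shifts for big-endian data.)
getWord24le :: Get Int
getWord24le = do
  a <- getWord8
  b <- getWord8
  c <- getWord8
  pure (fromIntegral a .|. (fromIntegral b `shiftL` 8) .|. (fromIntegral c `shiftL` 16))

-- Hypothetical layout: skip a 7-byte prefix, read the offset, jump to that
-- absolute position, then read two single bytes as stand-ins for width/height.
getOffsetAndDims :: Get (Int, Int, Int)
getOffsetAndDims = do
  skip 7
  off <- getWord24le
  pos <- bytesRead                      -- bytes consumed so far
  let delta = off - fromIntegral pos    -- distance still to skip
  if delta < 0
    then fail "offset points before the current position"
    else skip delta
  w <- getWord8
  h <- getWord8
  pure (off, fromIntegral w, fromIntegral h)

-- Run the parser, turning any failure (including short input) into Nothing.
parseOffsetAndDims :: BL.ByteString -> Maybe (Int, Int, Int)
parseOffsetAndDims bs =
  case runGetOrFail getOffsetAndDims bs of
    Left _          -> Nothing
    Right (_, _, r) -> Just r
```

Because Get is a monad, the value read for the offset is available to every later step, so "read a value, then use it to locate the rest" needs no special support.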

1 Answer

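Well, you can parse a 24-bit value out by indexing 3 bytes. Note that roll as written treats the first byte as the least significant one (little-endian); for network (big-endian) order, pass the bytes in reverse, i.e. roll [c,b,a]: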

import qualified Data.ByteString as B
import Data.ByteString (ByteString, index)
import Data.Bits
import Data.Int
import Data.Word

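-- A 24-bit value held in an Int32. No sign extension is performed,
-- so results are always in the range 0 .. 2^24 - 1.
type Int24 = Int32

-- Returns the decoded value and the remaining input.
-- Partial: index throws if the input has fewer than 3 bytes.
readInt24 :: ByteString -> (Int24, ByteString)
readInt24 bs = (roll [a,b,c], B.drop 3 bs)
   where a = bs `index` 0
         b = bs `index` 1
         c = bs `index` 2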

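-- Fold the bytes into one integer. Because this is a right fold,
-- the first list element ends up in the lowest 8 bits.
roll :: [Word8] -> Int24
roll = foldr unstep 0
  where
    unstep b a = a `shiftL` 8 .|. fromIntegral b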
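
If the input may be shorter than three bytes, a total variant avoids the exception from index. The following is only a sketch of that idea, not part of the answer above: the names fromBytesBE, fromBytesLE and readWord24With are hypothetical, and it returns an unsigned Word32 on the assumption that the value is a file offset.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, (.|.))
import Data.List (foldl')
import Data.Word (Word8, Word32)

-- Combine bytes with the first byte as the most significant (big-endian).
fromBytesBE :: [Word8] -> Word32
fromBytesBE = foldl' (\acc b -> acc `shiftL` 8 .|. fromIntegral b) 0

-- Little-endian is the same fold over the reversed byte list.
fromBytesLE :: [Word8] -> Word32
fromBytesLE = fromBytesBE . reverse

-- Decode the first three bytes with the given byte-order function.
-- Returns Nothing, rather than throwing, when fewer than 3 bytes remain.
readWord24With :: ([Word8] -> Word32) -> B.ByteString -> Maybe (Word32, B.ByteString)
readWord24With combine bs
  | B.length bs < 3 = Nothing
  | otherwise       = Just (combine (B.unpack h), t)
  where
    (h, t) = B.splitAt 3 bs
```

Passing the byte order as a function keeps the choice visible at the call site, e.g. readWord24With fromBytesLE input, and the Maybe result slots directly into a parser like readPES that already returns Maybe.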