2
votes

I've got the following, which type-checks:

p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' '))

Now, as the function name implies, I want it to give me an Int. But if I do this:

p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' ')) :: Int

I get this type error:

Couldn't match expected type `Int' with actual type `f0 b0'
In the return type of a call of `liftA'
In the expression:
    liftA read (many (char ' ') *> many1 digit <* many (char ' ')) ::
      Int
In an equation for `p_int':
    p_int
      = liftA read (many (char ' ') *> many1 digit <* many (char ' ')) ::
          Int

Is there a simpler, cleaner way to parse integers that may have whitespace? Or a way to fix this?

Ultimately, I want this to be part of the following:

betaLine = string "BETA " *> p_int <*> p_int  <*> p_int <*>
           p_int <*> p_parallel <*> p_exposure <* eol

which is to parse lines that look like this:

BETA  6 11 5 24 -1 oiiio

So I can eventually call a BetaPair constructor which will need those values (some as Int, some as other types like [Exposure] and Parallel)

(if you're curious, this is a parser for a file format that represents, among other things, hydrogen-bonded beta-strand pairs in proteins. I have no control over the file format!)

3 Answers

8
votes

How do I get Parsec to let me call read :: Int?

A second answer is "Don't use read".

Using read amounts to re-parsing data you have already parsed, so using it within a Parsec parser is a code smell. Parsing natural numbers is harmless enough, but read has different failure semantics from Parsec (it throws a runtime error instead of producing a ParseError), and it is tailored to Haskell's lexical syntax, so using it for more complicated number formats is problematic.

If you don't want to go to the trouble of defining a LanguageDef and using Parsec's Token module, here is a natural number parser that doesn't use read:

-- | Needs @foldl'@ from Data.List and 
-- @digitToInt@ from Data.Char.
--
positiveNatural :: Stream s m Char => ParsecT s u m Int
positiveNatural = 
    foldl' (\a i -> a * 10 + digitToInt i) 0 <$> many1 digit
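For reference, here is a minimal self-contained sketch of the same idea, specialised to String input and wrapped with the space-skipping from the question. The names natural and pInt are made up for this example, and the Parser alias from Text.Parsec.String is an assumption about the input type.

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')
import Text.Parsec (char, digit, many, many1, parse)
import Text.Parsec.String (Parser)

-- Fold the digits into an Int directly; read is not involved.
natural :: Parser Int
natural = foldl' (\acc d -> acc * 10 + digitToInt d) 0 <$> many1 digit

-- Hypothetical stand-in for the question's p_int:
-- skip literal spaces (and only spaces) on both sides of the number.
pInt :: Parser Int
pInt = many (char ' ') *> natural <* many (char ' ')
```

In GHCi, parse pInt "" "  42 " should give Right 42, and input without digits fails with an ordinary ParseError rather than an exception.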
5
votes

p_int is a parser that produces an Int, so the type would be Parser Int or similar¹.

p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' ')) :: Parser Int

Alternatively, you can annotate the read function itself, (read :: String -> Int), to tell the compiler which result type you want. The trailing :: Int then has to go, because the expression as a whole is still a parser and not an Int:

p_int = liftA (read :: String -> Int) (many (char ' ') *> many1 digit <* many (char ' '))

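As for cleaner ways, consider replacing many (char ' ') with spaces. Be aware that spaces also skips tabs and newlines, so it can swallow the line ending that a later eol expects.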

¹ ParsecT x y z Int, for example.
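To illustrate, here is a rough sketch with the annotation moved into a type signature, plus a cut-down version of the line parser. It assumes String input (the Parser alias from Text.Parsec.String). The Beta record and betaInts are hypothetical and cover only the four leading integers, not the real BetaPair with its Parallel and [Exposure] fields.

```haskell
import Control.Applicative (liftA)
import Text.Parsec (char, digit, many, many1, parse, string)
import Text.Parsec.String (Parser)

-- The signature says "parser that yields an Int";
-- that is enough for the compiler to pick read :: String -> Int.
p_int :: Parser Int
p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' '))

-- Hypothetical, simplified record standing in for BetaPair.
data Beta = Beta Int Int Int Int
  deriving (Show, Eq)

-- The constructor is mapped over the first parser with <$>,
-- then fed the remaining results with <*>.
betaInts :: Parser Beta
betaInts = string "BETA" *> (Beta <$> p_int <*> p_int <*> p_int <*> p_int)
```

With these definitions, parse betaInts "" "BETA  6 11 5 24" should yield Right (Beta 6 11 5 24).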

1
votes

You may find

Text.Megaparsec.Lexer.integer :: MonadParsec s m Char => m Integer

does what you want.

The vanilla parsec library seems to be missing a number of obvious parsers, which has led to the rise of "batteries included" parsec derivative packages. I suppose the parsec maintainers will get around to batteries eventually.

https://hackage.haskell.org/package/megaparsec-4.2.0/docs/Text-Megaparsec-Lexer.html

UPDATE

or with vanilla parsec:

Prelude Text.Parsec Text.Parsec.Language Text.Parsec.Token> parse ( integer . makeTokenParser $ haskellStyle ) "integer" "-1234"
Right (-1234)
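The same thing as a small module rather than a GHCi one-liner, narrowed to Int. This is only a sketch: the names lexer and intTok are made up, and String input is assumed. As far as I can tell, the token parser's integer is a lexeme parser, so it skips whitespace after the number (newlines included) but not before it, which matters if a later parser expects to see the end of line.

```haskell
import Text.Parsec (parse)
import Text.Parsec.Language (haskellStyle)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- Token parser built from the stock Haskell-style language definition.
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser haskellStyle

-- integer yields an Integer; fromInteger narrows it to Int
-- (silently wrapping on overflow).
intTok :: Parser Int
intTok = fromInteger <$> Tok.integer lexer
```

Here parse intTok "" "-1234 " should return Right (-1234).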