I am using import Text.Parsec.Text and import Text.Parsec.Char to parse some data that includes integers. I am using the following code to parse integers:
p_int :: Parser Int
p_int = read <$> ((++) <$> option "" (string "-") <*> many1 digit)
I profiled my program, and it turns out that the above snippet takes >90% of the execution time. How can I optimize it?
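One direction to try, sketched under assumptions rather than measured: p_int builds a String with (++) and then hands it to read, which runs a general-purpose lexer over the digits. The variant below keeps the same grammar (an optional '-', then one or more digits) but folds the digits into an Int directly. The name pIntFold is made up for this illustration, overflow is not checked, and whether it actually removes the bottleneck would have to be confirmed by profiling.

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')
import qualified Data.Text as T
import Text.Parsec (many1, option, parse)
import Text.Parsec.Char (char, digit)
import Text.Parsec.Text (Parser)

-- Hypothetical replacement for p_int (the name is an assumption).
-- It accepts the same input, but never calls read.
pIntFold :: Parser Int
pIntFold = do
  -- A leading '-' selects negate; otherwise the value is left unchanged.
  applySign <- option id (negate <$ char '-')
  digits <- many1 digit
  -- Strict left fold: accumulate the decimal value digit by digit.
  pure (applySign (foldl' step 0 digits))
  where
    step acc d = acc * 10 + digitToInt d
```

For example, parse pIntFold "" (T.pack "-42") evaluates to Right (-42). Note that many1 digit still allocates an intermediate list of characters, so this only removes the cost of read and (++), not every allocation.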
I came across the Text.ParserCombinators.Parsec.Number module, which contains an int function for parsing integers. However, its type is int :: Integral i => CharParser st i, which is not compatible with the Text-based parser I am using, as is evident from the error below.
• Couldn't match type ‘[Char]’ with ‘Text’
Expected type: Parser Int
Actual type: Text.ParserCombinators.Parsec.Char.CharParser () Int
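For context on this error: CharParser st is a synonym for a parser whose input stream is String, whereas the Parser from Text.Parsec.Text has Text as its input stream, so the two types cannot unify. A parser written against the Stream class instead of a fixed input type can be used with either. The following is a minimal sketch of that idea; digitRun, fromString and fromText are hypothetical names chosen for the illustration, not part of any library.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec (ParsecT, Stream, digit, many1, parse)

-- Hypothetical helper: works for any input type that is a stream of Char,
-- which includes both String and Text.
digitRun :: Stream s m Char => ParsecT s u m String
digitRun = many1 digit

-- The same parser instantiated at String input ...
fromString :: String -> Maybe String
fromString = either (const Nothing) Just . parse digitRun ""

-- ... and at Text input, with no change to digitRun itself.
fromText :: Text -> Maybe String
fromText = either (const Nothing) Just . parse digitRun ""
```

Here fromString "123abc" and fromText (T.pack "123abc") both give Just "123". A parser with a fixed String input type, such as one of type CharParser st i, cannot be instantiated this way.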
UPDATE
I replaced Text.Parsec.Text with Text.Parsec.String and replaced my integer parsing function with int from Text.ParserCombinators.Parsec.Number. This improved execution time by ~40%, but the performance is still worse than Python. Profiling shows that ~80% of the time is spent parsing integers. Does this mean Parsec is just slow?
COST CENTRE MODULE SRC %time %alloc
sign Text.ParserCombinators.Parsec.Number Text/ParserCombinators/Parsec/Number.hs:277:1-73 34.4 39.8
number Text.ParserCombinators.Parsec.Number Text/ParserCombinators/Parsec/Number.hs:(321,1)-(323,18) 26.7 27.5
numberValue Text.ParserCombinators.Parsec.Number Text/ParserCombinators/Parsec/Number.hs:(327,1)-(328,74) 10.2 6.7
zeroNumber Text.ParserCombinators.Parsec.Number Text/ParserCombinators/Parsec/Number.hs:(300,1)-(301,56) 6.0 10.0
...
....
COST CENTRE MODULE SRC no. entries %time %alloc (individual) %time %alloc (inherited)
int Text.ParserCombinators.Parsec.Number Text/ParserCombinators/Parsec/Number.hs:273:1-17 499 0 1.4 1.6 79.5 86.5
import Text.Parsec.String instead of Text.Parsec.Text here. – Willem Van Onsem
read isn't going to help you with efficiency. I would check stackoverflow.com/a/10726784/1248563 – cornuz
Text.Parsec.String. Do you have any more advice, please? – Random dude