Old answer for old nonworking example
Which version of parsec do you have installed? Version 3.1.9 does this for me:
Prelude> :m + Text.Parsec Text.Parsec.String
Prelude Text.Parsec Text.Parsec.String> :set prompt Main>
Main> let parser = choice (map (try . string) ["foo", "fob", "bar"]) :: GenParser Char st String
Main> runParser parser () "Hey" "fo "
Left "Hey" (line 1, column 1):
unexpected " "
expecting "foo", "fob" or "bar"
Main> runParser parser () "Hey" "fo"
Left "Hey" (line 1, column 1):
unexpected end of input
expecting "foo", "fob" or "bar"
The added <?> error_message doesn't change anything except the last line, which becomes: expecting expected one of ['foo', 'fob', 'bar'].
How to extract more errors out of Parsec
So this is one of those cases where you shouldn't trust the error message to be exhaustive about the information that is available in the system. Let me give a funky Show instance for Text.Parsec.Error:Message (which is basically what it would be if it were deriving (Show)) so that you can see what's coming out of Parsec:
Main> :m + Text.Parsec.Error
Main> instance Show Message where show m = (["SysUnExpect", "UnExpect", "Expect", "Message"] !! fromEnum m) ++ ' ' : show (messageString m)
Main> case runParser parser () "" "ta" of Left pe -> errorMessages pe
[SysUnExpect "\"t\"",SysUnExpect "",SysUnExpect "",Expect "\"head\"",Expect "\"tail\"",Expect "\"tales\""]
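For use in a source file rather than GHCi, the same inspection can be sketched without an orphan Show instance. This is only an illustration: the parser definition is an assumption inferred from the Expect messages in the session above, and describe and rawMessages are hypothetical helper names, not part of Parsec.

```haskell
import Text.Parsec (choice, parse, string, try)
import Text.Parsec.Error (Message (..), errorMessages)
import Text.Parsec.String (Parser)

-- Assumed parser, inferred from the Expect messages shown in the session.
parser :: Parser String
parser = choice (map (try . string) ["head", "tail", "tales"])

-- Render one Message roughly the way a derived Show instance would.
describe :: Message -> String
describe (SysUnExpect s) = "SysUnExpect " ++ show s
describe (UnExpect s)    = "UnExpect " ++ show s
describe (Expect s)      = "Expect " ++ show s
describe (Message s)     = "Message " ++ show s

-- All raw messages for an input, or [] when parsing succeeds.
rawMessages :: String -> [String]
rawMessages input = case parse parser "" input of
  Left err -> map describe (errorMessages err)
  Right _  -> []
```

Calling mapM_ putStrLn (rawMessages "ta") in GHCi should list the same messages as the session above, one per line.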
You can see that choice is secretly dumping all of its information into a bunch of parallel messages, and storing "unexpected end-of-file" as SysUnExpect "". The show instance for ParseError apparently grabs the first SysUnExpect but all of the Expect messages and dumps them for you to see.
The exact function which does this at present is Text.Parsec.Error:showErrorMessages. The error messages are expected to be in order and are broken into 4 chunks based on the constructor; the SysUnExpect chunk is sent through a special display function which hides the text completely if there are bona fide UnExpect elements, or else shows only the first SysUnExpect message:
showSysUnExpect | not (null unExpect) ||
                  null sysUnExpect = ""
                | null firstMsg    = msgUnExpected ++ " " ++ msgEndOfInput
                | otherwise        = msgUnExpected ++ " " ++ firstMsg
It may be worth rewriting this or filing a bug upstream, as this is kind of weird behavior, and the data structures don't quite suit it. First, your problem in a nutshell: it seems like each Message should have a SourcePos, not each ParseError.
So, there is an earlier step, mergeError, which prefers ParseErrors with later SourcePos-es. This doesn't fire because messages don't have a SourcePos, which means that all of the errors from choice start at the beginning of the string rather than at the maximal point matched. You can see this, for example, in how the following parser doesn't get stuck on parsing "tai":
let parser = try (string "head") <|> choice (map (try . (string "ta" >>) . string) ["il", "les"]) :: GenParser Char st String
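To check where each variant reports its error on your own Parsec version, something like the following sketch can help. The flat parser is an assumption inferred from the earlier session, and errorColumn is a hypothetical helper; only parse, errorPos and sourceColumn come from Parsec itself.

```haskell
import Text.Parsec (choice, errorPos, parse, sourceColumn, string, try, (<|>))
import Text.Parsec.String (Parser)

-- Flat alternatives: every branch restarts from the beginning of the input.
flat :: Parser String
flat = choice (map (try . string) ["head", "tail", "tales"])

-- Factored alternatives from the text: the shared prefix "ta" is consumed
-- before the remaining suffixes are tried.
factored :: Parser String
factored =
  try (string "head")
    <|> choice (map (try . (string "ta" >>) . string) ["il", "les"])

-- Column at which the parse error is reported, or Nothing on success.
errorColumn :: Parser a -> String -> Maybe Int
errorColumn p input = case parse p "" input of
  Left err -> Just (sourceColumn (errorPos err))
  Right _  -> Nothing
```

Comparing errorColumn flat "tai" with errorColumn factored "tai" shows whether the factored form moves the reported position past the shared prefix, as the text above suggests it should.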
Second, apart from that, we should probably bind together messages that go together (so the default message is unexpected "t", expected "head" | unexpected end-of-file, expected "tail" | unexpected end-of-file, expected "tales" unless you override it with <?>). Third, the ParseError constructor should probably be exported; fourth, the enumerated type in Message is really ugly and might be better put into ParseError {systemUnexpected :: [Message], userUnexpected :: [Message], expected :: [Message], other :: [Message]} or something, even in its present incarnation. (For example, the current Show for ParseError will break subtly if the messages aren't in a certain order.)
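The record shape suggested above can be approximated outside the library. This is a rough sketch, not Parsec's API: GroupedError, groupMessages, demoParser and expectedFor are hypothetical names, and the demo parser is assumed from the earlier session.

```haskell
import Text.Parsec (ParseError, choice, parse, string, try)
import Text.Parsec.Error (Message (..), errorMessages, messageString)
import Text.Parsec.String (Parser)

-- Hypothetical record mirroring the field names suggested in the text.
data GroupedError = GroupedError
  { systemUnexpected :: [Message]
  , userUnexpected   :: [Message]
  , expected         :: [Message]
  , other            :: [Message]
  }

-- Split the flat message list by constructor, so nothing depends on ordering.
groupMessages :: ParseError -> GroupedError
groupMessages err = GroupedError
  { systemUnexpected = [m | m@(SysUnExpect _) <- msgs]
  , userUnexpected   = [m | m@(UnExpect _) <- msgs]
  , expected         = [m | m@(Expect _) <- msgs]
  , other            = [m | m@(Message _) <- msgs]
  }
  where
    msgs = errorMessages err

-- Assumed example parser, matching the earlier session.
demoParser :: Parser String
demoParser = choice (map (try . string) ["head", "tail", "tales"])

-- Expected alternatives for an input, or [] when parsing succeeds.
expectedFor :: String -> [String]
expectedFor input = case parse demoParser "" input of
  Left err -> map messageString (expected (groupMessages err))
  Right _  -> []
```

Grouping by pattern match rather than by position in the list avoids the ordering assumption mentioned above, although it still cannot recover which "unexpected" belongs to which "expected".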
In the meantime I would recommend writing your own show variant for ParseError.
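One possible shape for such a variant is sketched below. It is an assumption-laden illustration rather than a drop-in replacement: showAllMessages, demoParser and explain are hypothetical names, and unlike Parsec's own display it prints every distinct "unexpected" message (system and user alike) instead of only the first.

```haskell
import Data.List (intercalate, nub)
import Text.Parsec (ParseError, choice, errorPos, parse, string, try)
import Text.Parsec.Error (Message (..), errorMessages, messageString)
import Text.Parsec.String (Parser)

isUnexpected, isExpect, isOther :: Message -> Bool
isUnexpected (SysUnExpect _) = True
isUnexpected (UnExpect _)    = True
isUnexpected _               = False
isExpect (Expect _) = True
isExpect _          = False
isOther (Message _) = True
isOther _           = False

-- Hypothetical show variant: position header, then one indented line per
-- distinct "unexpected" message, one "expecting" line, and any other messages.
showAllMessages :: ParseError -> String
showAllMessages err = intercalate "\n" (header : map ("  " ++) body)
  where
    msgs = errorMessages err
    header = show (errorPos err) ++ ":"
    body = unexpectedLines ++ expectingLines ++ otherLines
    -- Distinct message strings whose constructor satisfies the predicate.
    pick p = nub [messageString m | m <- msgs, p m]
    unexpectedLines = ["unexpected " ++ orEndOfInput s | s <- pick isUnexpected]
    expectingLines = case filter (not . null) (pick isExpect) of
      [] -> []
      es -> ["expecting " ++ intercalate ", " es]
    otherLines = filter (not . null) (pick isOther)
    -- An empty system message stands for end of input.
    orEndOfInput s = if null s then "end of input" else s

-- Assumed example parser, matching the earlier session.
demoParser :: Parser String
demoParser = choice (map (try . string) ["head", "tail", "tales"])

-- Full error report for an input, or "ok" when parsing succeeds.
explain :: String -> String
explain input = either showAllMessages (const "ok") (parse demoParser "" input)
```

Running putStrLn (explain "ta") in GHCi prints the report. Whether showing every "unexpected" line is an improvement depends on the grammar, since the lines are still not tied to the alternative that produced them.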
<|> doesn't work, as this is exactly what choice is doing. I would need to inspect Char by Char and eliminate impossible choices on each step. I asked because it seems to be a common problem. Strange if there weren't a standard solution. – snøreven
try. – Simon Shine
choice and also when going through the individual parsers with <|>. Without it the failed parsers will consume input. – snøreven