
I need to report a failure message at a given position in Parsec.

I tried setting the position before raising an unexpected error, but it didn't work:

runParser ( do pos0 <- getPosition
               id <- many1 alphaNum
               if (id == reverse id) then return id
                                     else setPosition pos0 >> unexpected id
               eof )
          () "" "abccbb"

This gives back:

Left (line 1, column 7):
unexpected end of input
expecting letter or digit

The response I want instead is:

unexpected abccbb
expecting letter or digit

That message can be produced, but with the wrong position, by omitting setPosition pos0 >> from the code.

My workaround is to do the parsing, save both the correct and the actual error position in Parsec's user state, and fix up the error position afterwards, but I would like a better solution.

As AndrewC asked: this is part of giving more informative error messages to our users. For example, in some places we want special identifiers, but if that format were encoded in the parser itself, Parsec would give an error message like "expected a g, got an r, position is in the middle of an identifier". The correct message would be "identifier expected in the special format, but got 'abccbb', position is before the identifier". If there is a better approach for producing error messages like this, it would be a correct answer to our question. But I'm also curious why Parsec behaves like that, and why I cannot raise a custom error message pointing to the position I want.

Digression: is this part of some larger problem? I'm not sure that using a parser like this is helping at the moment, since you're essentially just using it to do the equivalent of if all alphaNum xs then ... (comment by AndrewC)

1 Answer


This is because the parser collects all errors that occurred at the furthest position in the input. When binding two parsers, any errors detected by those parsers are merged by mergeError:

mergeError :: ParseError -> ParseError -> ParseError
mergeError e1@(ParseError pos1 msgs1) e2@(ParseError pos2 msgs2)
    -- prefer meaningful errors
    | null msgs2 && not (null msgs1) = e1
    | null msgs1 && not (null msgs2) = e2
    | otherwise
    = case pos1 `compare` pos2 of
        -- select the longest match
        EQ -> ParseError pos1 (msgs1 ++ msgs2)
        GT -> e1
        LT -> e2

In your example, many1 alphaNum consumes the whole string, reaches the end of input, and records an error at column 7. This error does not make the parser fail, but it is remembered. When you set the position back to column 1 and call unexpected, that creates an error at column 1. The bind operator applies mergeError to the two errors, and the one at column 7 wins because it is further along in the input.
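The merge can be observed in isolation. The following is a minimal sketch, assuming the installed parsec exports mergeError from Text.Parsec.Error and that it behaves like the definition quoted above (older releases may merge differently). The names atStart, atEnd and merged are made up for illustration; they stand in for the error from unexpected at column 1 and the remembered error from many1 at column 7.

```haskell
import Text.Parsec.Error (Message (..), ParseError, errorPos, mergeError, newErrorMessage)
import Text.Parsec.Pos (newPos, sourceColumn)

-- Stand-in for the error raised by `unexpected id` after setPosition pos0.
atStart :: ParseError
atStart = newErrorMessage (UnExpect "abccbb") (newPos "" 1 1)

-- Stand-in for the error many1 alphaNum remembers at the end of the input.
atEnd :: ParseError
atEnd = newErrorMessage (SysUnExpect "") (newPos "" 1 7)

-- Both errors carry messages, so the one at the later position is kept.
merged :: ParseError
merged = mergeError atStart atEnd

main :: IO ()
main = print (sourceColumn (errorPos merged))
```

Under that assumption the merged error sits at column 7 regardless of argument order, which is why moving the position back with setPosition has no visible effect.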

Using lookAhead, we can write a function isolate to run a parser p without appearing to consume any input or register any errors. The isolate parser returns a tuple containing the result of p and the parser state at the end of p so that we can jump back to that state if we so desire:

isolate :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, (State s u))
isolate p = try . lookAhead $ do
  x <- p
  s <- getParserState
  return (x, s)

With that, we can implement a palindrome parser:

palindrome = ( do
                 (id, s) <- isolate $ many1 alphaNum
                 if (id == reverse id) then (setParserState s >> return id)
                   else unexpected $ show id
             ) <?> "palindrome"

This runs the many1 alphaNum parser in an isolated context that does not appear to have consumed any input. If the result is a palindrome, we set the parser state back to where it was at the end of the many1 alphaNum and return its result. Otherwise, we report an unexpected id error, which will be registered at the position where the many1 alphaNum started.

So now,

main :: IO ()
main = print $ runParser (palindrome <* eof) () "" "Bolton"

Prints:

Left (line 1, column 1):
unexpected "Bolton"
expecting palindrome
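For the more general case in the question (identifiers that must match some special format), the same idea can be wrapped in a reusable helper. This is only a sketch under some assumptions: validated is a hypothetical name, not something parsec provides, and the types are specialised to String input with no user state to keep the signatures simple. isolate is repeated here with imports so the example compiles on its own.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.Prim (State)

-- Run p without appearing to consume input, returning its result together
-- with the parser state at the end of p.
isolate :: Parsec String () a -> Parsec String () (a, State String ())
isolate p = try . lookAhead $ do
  x <- p
  s <- getParserState
  return (x, s)

-- Hypothetical helper: run p, then check its result. A rejected result is
-- reported as "unexpected <result>, expecting <what>" at the position where
-- p started.
validated :: Show a => String -> (a -> Bool) -> Parsec String () a -> Parsec String () a
validated what ok p = go <?> what
  where
    go = do
      (x, s) <- isolate p
      if ok x
        then setParserState s >> return x -- jump to where p finished
        else unexpected (show x)          -- fails at the start position

palindrome :: Parsec String () String
palindrome = validated "palindrome" (\w -> w == reverse w) (many1 alphaNum)

main :: IO ()
main = do
  print (parse (palindrome <* eof) "" "abccba")
  print (parse (palindrome <* eof) "" "abccbb")
```

One design caveat: because the final state is installed with setParserState, a successful validated parser still looks to parsec as if it consumed no input, so it should not be placed directly under many.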