Problem statement
Suppose I have two parsers p
and q
and I concatenate them like this:
r = try p *> q
In Parsec, the behavior of this is:
- If
p
fails without consuming input, thenr
fails without consuming input. - If
p
fails after consuming input, thenr
fails without consuming input. - if
q
fails without consuming input, thenr
fails after consumingp
. - if
q
fails after consuming input, thenr
fails after consumingp
and parts ofq
.
However, the behavior I'm looking for is a bit unusual:
- If
p
fails without consuming input, thenr
should fail without consuming input. - If
p
fails after consuming input, thenr
should fail without consuming input. - if
q
fails without consuming input, thenr
should fail without consuming input. - if
q
fails after consuming input, thenr
should fail after consuming some input.
I can't seem to think of a clean way to do this.
Rationale
The reason is that I have a parser like this:
s = (:) <$> q <*> many r
The q
parser, embedded inside the r
parser, needs a way to signal either: invalid input (which occurs when q
consumes input but fails), or end of the many
loop (which occurs when q
doesn't consume anything and fails). If the input is invalid, it should just fail the parser entirely and report the problem to the user. If there is no more input to consume, then it should end the many
loop (without reporting a parser error to the user). The trouble is that it's possible that the input ends with a p
but without any more valid q
's to consume, in which case q
will fail but without consuming any input.
So I was wondering if anyone have an elegant way to solve this problem? Thanks.
Addendum: Example
p = string "P"
q = (++) <$> try (string "xy") <*> string "z"
Test input on (hypothetical) parser s
, had it worked the way I wanted:
xyz
(accept)xyzP
(accept;P
remains unparsed)xyzPx
(accept;Px
remains unparsed;q
failed but did not consume any input)xyzPxy
(reject; parserq
consumedxy
but failed)xyzPxyz
(accept)
In the form r = try p *> q
, s
will actually reject both case #2 and case #3 above. Of course, it's possible to achieve the above behavior by writing:
r = (++) <$> try (string "P" *> string "xy") <*> string "z"
but this isn't a general solution that works for any parsers p
and q
. (Perhaps a general solution doesn't exist?)
try (try p *> q)
not work? - Mateusz Kowalczyktry
discards all information about the progress of the input (i.e. maps both kinds of failures to "failure without consuming input") so you can't emulate the behavior described with onlytry
. - Rufflewind