- Question. Is there a way to define an operator such that a sequence of values separated by that operator yields a flat tuple?
I have struggled to find a concise phrasing of my question, so read on for details and examples...
Description
I'm writing myself a few helper operators for Parsec, starting with the following:
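import Prelude hiding ((<>))  -- recent Preludes export the Semigroup (<>)
import Text.ParserCombinators.Parsec

-- Apply a function to the result of a parser.
(<@) :: GenParser tok st a -> (a -> b) -> GenParser tok st b
p <@ f = do {
  x <- p ;
  return (f x)
}

-- Run two parsers in sequence and pair up their results.
(<>) :: GenParser tok st a -> GenParser tok st b -> GenParser tok st (a, b)
p <> q = do {
  x <- p ;
  y <- q ;
  return (x, y)
}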
These work as follows:
parseTest ((many upper) <@ length) "HELLO"
5
parseTest ((many upper) <> (many lower)) "ABCdef"
("ABC", "def")
Unfortunately, a sequence of parsers separated by <> will result in a nested tuple, e.g.:
parseTest (subject <> verb <> object) "edmund learns haskell"
(("edmund", "learns"), "haskell")
Instead of the relatively more cromulent:
("edmund", "learns", "haskell")
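With the operators as defined, one workaround is to reshape the nested pair after the fact using <@. The following is only a minimal sketch: `flatten3`, `word`, and `sentence` are hypothetical names made up for illustration (not part of Parsec), and `word` stands in for `subject`, `verb`, and `object`.

```haskell
import Prelude hiding ((<>))
import Text.Parsec
import Text.Parsec.String (GenParser, Parser)

-- The two helper operators, restated so this block stands alone.
(<@) :: GenParser tok st a -> (a -> b) -> GenParser tok st b
p <@ f = fmap f p

(<>) :: GenParser tok st a -> GenParser tok st b -> GenParser tok st (a, b)
p <> q = do
  x <- p
  y <- q
  return (x, y)

-- Hypothetical helper: turn the left-nested pair into a flat triple.
flatten3 :: ((a, b), c) -> (a, b, c)
flatten3 ((x, y), z) = (x, y, z)

-- Hypothetical word parser: lowercase letters, then any trailing spaces.
word :: Parser String
word = many1 lower <* spaces

-- Three words in sequence, flattened once at the end.
sentence :: Parser (String, String, String)
sentence = (word <> word <> word) <@ flatten3
```

In GHCi, `parseTest sentence "edmund learns haskell"` should print the flat triple. The obvious drawback is that a separate flattening function is needed for every chain length, which is exactly what I would like to avoid.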
I'm looking for a way to define <> so that
p1 :: GenParser tok st a ; p2 :: GenParser tok st b ; p3 :: GenParser tok st c
p1 <> p2 :: GenParser tok st (a, b)
p1 <> p2 <> p3 :: GenParser tok st (a, b, c)
...
I do not think I have ever seen a Haskell program where a tuple type of length n (known at compile time) is constructed like this. And I suspect it may be difficult to define the operator with both types:
GenParser tok st a -> GenParser tok st b -> GenParser tok st (a, b)
GenParser tok st (a, b) -> GenParser tok st c -> GenParser tok st (a, b, c)
How does one tell, at compile time, the difference between a tuple resulting from <> and one which is simply the intended return type of any other parser? I can only speculate that additional syntax would be required.
So, I am not at all sure it's a good idea or even possible. I would be curious to know how to do it even if it is not a good idea for my case (and I would love to know how to if it's impossible!).
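For comparison, the closest standard idiom I am aware of avoids the typing problem rather than solving it: the tuple constructor is named up front and the parsers are chained with the Applicative operators `<$>` and `<*>`, so the arity is fixed by the constructor instead of being inferred from the chain. A minimal sketch, assuming a hypothetical `word` parser in place of `p1`, `p2`, and `p3`:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical word parser: lowercase letters, then any trailing spaces.
word :: Parser String
word = many1 lower <* spaces

-- (,,) is the ordinary three-argument tuple constructor; each <*> feeds it
-- the result of the next parser, so no nested pair is ever built.
sentence :: Parser (String, String, String)
sentence = (,,) <$> word <*> word <*> word
```

This is not the operator asked for, since the length of the tuple still has to be written out by hand via `(,,)`; it only shows that a flat result is reachable without intermediate pairs.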
- Followup question (if this crazy scheme is possible). How could one annotate one item in a chain of <>s to be left out of the result tuple?
For example, assuming a postfix annotation <#:
p1 :: GenParser tok st a
p2 :: GenParser tok st b
p1 <> keyword "is" <# <> p2 :: GenParser tok st (a, b)
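In the Applicative style, the behaviour I have in mind for <# seems to correspond roughly to the standard `<*` operator, which runs the parser on its right but keeps only the result on its left. A minimal sketch under that assumption; `keyword` and `word` here are hypothetical stand-ins written for this example, not library functions:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for the `keyword` parser used above.
keyword :: String -> Parser String
keyword s = try (string s) <* spaces

-- Hypothetical word parser: lowercase letters, then any trailing spaces.
word :: Parser String
word = many1 lower <* spaces

-- Rough equivalent of: p1 <> keyword "is" <# <> p2
-- The keyword is parsed, but its result is dropped by <*.
statement :: Parser (String, String)
statement = (,) <$> word <* keyword "is" <*> word
```

With these definitions, `parseTest statement "haskell is fun"` should yield a pair containing only the two words.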
Background
Circa 2006 I learned about parser combinators at university. We used a library in which the <@ and, I believe, the <> operators were present and performed similarly to my attempts. I do not know what this library was; it may have been written by a graduate student for teaching our class. In any case, it does not seem to be either Parsec or the base parser combinators in Text.ParserCombinators.
- Bonus question. What is the difference between the base parser combinators in Text.ParserCombinators.ReadP/ReadPrec and the ones in Parsec?
I seem to recall this library was also nondeterministic, with each parser invocation returning the set of possible parses and the remaining unparsed input for each. (A successful, complete, unambiguous parse would result in [(parseresult, "")].)
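As far as I can tell, ReadP in base behaves in a similar list-of-parses way, which may help pin down what I am describing. A small sketch using only functions exported by Text.ParserCombinators.ReadP; the names `uppers`, `allParses`, and `completeParses` are my own:

```haskell
import Data.Char (isUpper)
import Text.ParserCombinators.ReadP

-- One or more uppercase letters. many1 is nondeterministic in ReadP,
-- so every possible stopping point counts as a separate parse.
uppers :: ReadP String
uppers = many1 (satisfy isUpper)

-- Every (result, remaining input) pair for the input "ABC".
allParses :: [(String, String)]
allParses = readP_to_S uppers "ABC"
-- [("A","BC"),("AB","C"),("ABC","")]

-- Requiring end of input leaves only the complete parse.
completeParses :: [(String, String)]
completeParses = readP_to_S (uppers <* eof) "ABC"
-- [("ABC","")]
```

Parsec, by contrast, commits to a single result per run, so the library I remember resembles ReadP more closely in this respect.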
- Final question. If this sounds like something you've heard of, could you let me know what it was (for nostalgia's sake)?