
I have a strange whim. Suppose I have something like this:

data Statement = StatementType Stuff Source

Now I want to parse such a statement, parse all the stuff, and afterwards put all the characters that were consumed for this particular statement into the resulting data structure. For some reason.

Is this possible, and if so, how can I accomplish it?

You tagged it with parsec, so I assume you are parsing it with parsec? How about you give a concrete example of your parser. - Joachim Breitner
@JoachimBreitner, yes, I want to use Parsec. I'm sorry there is no example yet, I'm at the planning stage. But it should be nothing special: a language definition, a token parser, then many little functions inside the Parser monad that extract tokens one by one and return some data structures. - Mark Karpov

1 Answer


In general this is not possible. parsec does not expect a lot from its stream type; in particular, there is no way to efficiently split a stream.

But for a concrete stream type (e.g. String, or [a], or ByteString) a hack like this would work:

parseWithSource :: Parsec [c] u a -> Parsec [c] u ([c], a)
parseWithSource p = do
    input <- getInput
    a <- p
    input' <- getInput
    return (take (length input - length input') input, a)

This solution relies on the function getInput, which returns the current (remaining) input. We read the input twice, once before and once after running the parser. The difference in length gives the exact number of consumed elements, and with that number we can take those elements from the original input.

Here you can see it in action:

*Main Text.Parsec> parseTest (between (char 'x') (char 'x') (parseWithSource ((read :: String -> Int) `fmap` many1 digit))) "x1234x"
("1234",1234)
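
To connect this back to the question, here is a minimal sketch of the same hack as a self-contained module. The Statement type is a simplified, hypothetical stand-in for the one in the question (an Int payload plus the source text), and number and statement are names made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for the question's type:
-- a parsed value together with the source text it came from.
data Statement = Statement Int String
  deriving (Show, Eq)

-- Run p and also return the part of the input that p consumed.
-- Only works for list-like streams such as String.
parseWithSource :: Parsec [c] u a -> Parsec [c] u ([c], a)
parseWithSource p = do
  before <- getInput              -- remaining input before p
  a      <- p
  after  <- getInput              -- remaining input after p
  let consumed = length before - length after
  return (take consumed before, a)

-- Illustrative payload parser: one or more digits read as an Int.
number :: Parser Int
number = read <$> many1 digit

-- Store both the parsed value and its source in the result.
statement :: Parser Statement
statement = do
  (src, n) <- parseWithSource number
  return (Statement n src)
```

Note that length walks the whole remaining input on every call, so wrapping many small parsers this way on a large input can get slow.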

But you should also look into attoparsec, as it properly supports this functionality with the match function.
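
For comparison, a rough attoparsec counterpart of the example above might look like this. It is only a sketch, assuming a strict ByteString input; numberWithSource and bracketed are illustrative names, not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString (match)
import Data.Attoparsec.ByteString.Char8 (Parser, char, decimal, parseOnly)
import Data.ByteString (ByteString)

-- match runs a parser and returns the consumed input alongside its result.
numberWithSource :: Parser (ByteString, Int)
numberWithSource = match decimal

-- Same shape as the parsec demo: a number between two 'x' characters.
bracketed :: Parser (ByteString, Int)
bracketed = char 'x' *> numberWithSource <* char 'x'

-- In GHCi:
--   parseOnly bracketed "x1234x"
--   Right ("1234",1234)
```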