As a side project I'm working on implementing my own Regex parser/compiler.
I figured that the actual matching portion would be a great time to learn how some backtracking monads work; like the list monad for example.
I ended up structuring my 'match' algorithm like so:
data Expr
= Group Int Expr
| Terms [Expr]
| OneOf [Expr]
| Repetition Expr Int (Maybe Int)
| Empty
| BackRef Int
| Wildcard
| Atom Char
| Start
| End
deriving (Show)
data RState = RState
{ groups :: Groups
, leftover :: String
} deriving Show
match :: Expr -> ListT (State RState) String
where ListT is from Gabriel's List.Transformer lib
The State monad portion tracks the groups which have been captured (so that I can use them when I match on back-references) and the remaining string left to consume. The Expr
is a data-structure representing the Regex as an AST of sorts.
Anyways; this works well; I recurse trying to make matches, if the match succeeds I return the matched portion as a result and alter the leftovers and groups in state accordingly, this ends up working in non-determinism because it will keep trying to continue using all possible previous matches when trying to process the next part of the regex. The problem is that it only backtracks on previous results, but NOT to the previous state! As a result my alternative
match code for matching chains of optionals in the regex (e.g. a|b*|c; where each part is a 'term') looks like this:
match (OneOf terms) = do
st <- get
-- Backtrack the state and try again on each alternative
let try t = put st >> match t
asum (try <$> terms)
basically I try to match on each term, but even if the match fails it still alters state! So I need to manually rewind state in between failures. This is due to the ListT implementation of <|>
:
ListT m <|> l = ListT (do
s <- m
case s of
Nil -> next l
Cons x l' -> return (Cons x (l' <|> l)) )
Where we see it runs the underlying monad and continues in the same context.
I want something similar to this:
ListT m <|> l = ListT (do
st <- get
s <- m
case s of
Nil -> put st >> next l
Cons x l' -> return (Cons x (l' <|> l)) )
... But for all monad effects in general; I'm not even sure if this is possible.
I looked into the Logic Monad as a possible solution, but even after reading the paper I can't quite tell whether it can do what I want, maybe it's possible using ifte
?
Is there a backtracking monad which not only backtracks the result of the monad, but also the monadic computations? I suppose this is impossible for things like IO, but should in theory be possible for pure monads? I guess that means it's not possible in general, but is there a type-class I could implement to gain this functionality for my case?
Thanks! I know I'm rambling out loud here, thanks for helping me think through this!
StateT RState [] String
. That way each brach should have its own state. In aListT (State RState) String
stack the state is shared across branches, as you have discovered. – danidiazIO
and the user executedlaunchNuclearMissiles china
? Would you unlaunch the missiles at China? – user253751