Agda: parsing nested lists

Question

I am trying to parse nested lists in Agda. Searching Google, the closest material I found covers parsing in Haskell, but it usually relies on libraries such as "parsec" that are not available in Agda.

So I would like to parse "((1,2,3),(4,5,6))" with a result type of (List (List Nat)).

And further nested lists should be supported (up to depth 5), e.g., depth 3 would be (List (List (List Nat))).

My code is very long and cumbersome, and it only works for (List (List Nat)) but not for further nested lists. I didn't make any progress on my own.

If helpful, I would like to reuse splitBy from the first answer of one of my older posts.

NesList : ℕ → Set
NesList 0 = ℕ -- this case is easy
NesList 1 = List ℕ -- this case is easy
NesList 2 = List (List ℕ) 
NesList 3 = List (List (List ℕ))
NesList 4 = List (List (List (List ℕ)))
NesList 5 = List (List (List (List (List ℕ)))) -- I am only interested in lists up to depth 5
NesList _ = ℕ -- this is a hack, but I think okay for now


-- My implementation is *not* shown here
--
--
-- (it's about 80 lines long and uses 3 different functions)
parseList2 : List Char → Maybe (List (List ℕ))
parseList2 _ = nothing -- dummy result


parseList : (depth : ℕ) → String → Maybe (NesList depth)
parseList 2 s = parseList2 (toList s)
parseList _ _ = nothing



-- Test Cases that are working (in my version)

p1 : parseList 2 "((1,2,3),(4,5,6))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ [])
p1 = refl


p2 : parseList 2 "((1,2,3),(4,5,6),(7,8,9,10))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ 9 ∷ 10 ∷ []) ∷ [])
p2 = refl

p3 : parseList 2 "((1),(2))" ≡ just ((1 ∷ []) ∷ (2 ∷ []) ∷ [])
p3 = refl

p4 : parseList 2 "((1,2))" ≡ just ((1 ∷ 2 ∷ []) ∷ [])
p4 = refl

-- Test Cases that are not working 
-- i.e., List (List (List Nat))

lp5 : parseList 3 "(((1,2),(3,4)),((5,6),(7,8)))" ≡ just (  ((1 ∷ 2 ∷ []) ∷ (3 ∷ 4 ∷ []) ∷ []) ∷ ((5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ []) ∷ []) ∷ [])
lp5 = refl

EDIT1:
Conor's talk at ICFP is online; the title is "Agda-curious?".
It is from two days ago. Check it out!
See the video:
http://www.youtube.com/watch?v=XGyJ519RY6Y

--
EDIT2:
I found a link that seems to be almost the code I need for my parsing.
There is a tokenize function provided:
https://github.com/fkettelhoit/agda-prelude/blob/master/Examples/PrefixCalculator.agda

--
EDIT3:
I finally found a simple combinator library that should be fast enough. The library includes no examples, so I still have to work out how to use it to solve the problem.
Here is the link:

https://github.com/crypto-agda/agda-nplib/blob/master/lib/Text/Parser.agda

There is more Agda code from Nicolas Pouillard online:
https://github.com/crypto-agda

My Agda isn't so good, so this can probably be ignored, but shouldn't it be possible to have something like parseList' elementParser s, which parses a list of items that are each parsed by elementParser, and then define parseList 1 s = parseList' parseAnInt s and parseList n s = parseList' (parseList (n-1)) s, i.e. express the recursion naturally? (Similarly, couldn't NesList 0 = ℕ, NesList n = List (NesList (n-1)) be used?) — huon
Any time you want to nest a functor within itself a variable number of times, use a free monad: haskellforall.com/2012/06/… — Gabriella Gonzalez
The free monad approach is probably inappropriate to this problem. It hands the decision of how deeply nested the list^n should be to the producer of the data, when the OP seems to want the acceptable nestedness of the data to be under the control of the data consumer. The free monad on list is thus too big a type for what's specified. Junk in types is unfortunate and avoidable. This is a job for parser combinators. Nils Anders Danielsson has written a parser combinator library for Agda. It might even ship with the usual distribution. — pigworker
@pigworker Thanks for your comment. I know the parser combinators from Nils. As far as I understood, it's an academic project, and the last time I checked (1-2 years ago) the combinators were extremely inefficient. (About 10 characters were the maximum.) — mrsteve
I added a link to a "tokenize" function, perhaps this can help. Although I don't see the solution yet. — mrsteve

NovaDenizen · Accepted Answer · 2012-09-14T19:28:39

I don't have access to an Agda implementation right now, so I can't check the syntax, but this is how I would approach it.

First, NesList can be simplified.

NesList : ℕ → Set
NesList 0 = ℕ
NesList (suc n) = List (NesList n)
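As a quick sanity check of the recursive definition, here is a small sketch assuming a recent Agda standard library. The names depth2 and example3 are made up for illustration. Because NesList 2 computes to List (List ℕ), refl is accepted as a proof of the equation, and an ordinary three-level list literal inhabits NesList 3.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.List using (List; []; _∷_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

NesList : ℕ → Set
NesList zero    = ℕ
NesList (suc n) = List (NesList n)

-- NesList 2 unfolds definitionally to List (List ℕ).
depth2 : NesList 2 ≡ List (List ℕ)
depth2 = refl

-- A depth-3 value is just a list of lists of lists of naturals.
example3 : NesList 3
example3 = ((1 ∷ 2 ∷ []) ∷ []) ∷ []
```

Unlike the seven-clause version in the question, this definition needs no catch-all case and works for any depth.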

Then you need a general-purpose list parsing function. A parser returns the successfully parsed value together with the remainder of the input. Instead of Maybe you could use List to represent alternative parses.

Parser : Set -> Set
Parser a = List Char -> Maybe (a × List Char) -- _×_ is the pair type from Data.Product
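To make the Parser type concrete, here is a minimal sketch of one primitive parser, assuming a recent Agda standard library (1.x or later, where Data.Char exports _==_). The helper char is hypothetical and not part of any library: it consumes exactly one expected character and returns the rest of the input.

```agda
open import Data.Bool using (if_then_else_)
open import Data.Char using (Char; _==_)
open import Data.List using (List; []; _∷_)
open import Data.Maybe using (Maybe; just; nothing)
open import Data.Product using (_×_; _,_)
open import Data.String using (toList)
open import Data.Unit using (⊤; tt)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

Parser : Set → Set
Parser A = List Char → Maybe (A × List Char)

-- Hypothetical helper: succeed only if the input starts with c.
char : Char → Parser ⊤
char c []       = nothing
char c (x ∷ xs) = if x == c then just (tt , xs) else nothing

-- The opening parenthesis is consumed and "1)" is left over.
openParen : char '(' (toList "(1)") ≡ just (tt , toList "1)")
openParen = refl
```

Returning the leftover input is what lets parsers be chained: the next parser simply starts where the previous one stopped.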

Given a parser for type a, this parses a parenthesis-delimited, comma-separated list of a.

parseGeneralList : {a : Set} -> Parser a -> Parser (List a)
parseGeneralList = ...implement me!...

This parses a general NesList.

parseNesList : (a : ℕ) -> Parser (NesList a)
parseNesList 0 = parseNat
parseNesList (suc n) = parseGeneralList (parseNesList n)

Edit: As was pointed out in the comments, code using this kind of Parser won't pass Agda's termination checker. I think that if you want to do parser combinators, you need a Stream-based setup.
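One possible way around the termination concern without streams is to bound the recursion with an explicit "fuel" counter, so that every recursive call is structurally smaller. The following is a sketch under assumptions, not a checked reference implementation: it assumes a recent Agda standard library (1.x or later), the helpers digitVal, natAcc, consResult, and elems are hypothetical names, and empty lists "()" are deliberately not accepted.

```agda
open import Data.Bool using (if_then_else_)
open import Data.Char using (Char; isDigit; toℕ; _==_)
open import Data.List using (List; []; _∷_; length)
open import Data.Maybe using (Maybe; just; nothing)
open import Data.Nat using (ℕ; zero; suc; _+_; _*_; _∸_)
open import Data.Product using (_×_; _,_)
open import Data.String using (String; toList)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

NesList : ℕ → Set
NesList zero    = ℕ
NesList (suc n) = List (NesList n)

Parser : Set → Set
Parser A = List Char → Maybe (A × List Char)

digitVal : Char → ℕ
digitVal c = toℕ c ∸ 48   -- 48 is the code point of '0'

-- Accumulate further digits; structurally recursive on the input.
natAcc : ℕ → List Char → ℕ × List Char
natAcc acc []       = acc , []
natAcc acc (c ∷ cs) =
  if isDigit c then natAcc (acc * 10 + digitVal c) cs else (acc , c ∷ cs)

parseNat : Parser ℕ
parseNat []       = nothing
parseNat (c ∷ cs) = if isDigit c then just (natAcc (digitVal c) cs) else nothing

consResult : {A : Set} → A → Maybe (List A × List Char) → Maybe (List A × List Char)
consResult x nothing          = nothing
consResult x (just (xs , cs)) = just (x ∷ xs , cs)

-- Parses "elem (',' elem)* ')'"; the first argument is the fuel.
elems : {A : Set} → ℕ → Parser A → Parser (List A)
elems zero    p cs = nothing
elems (suc k) p cs with p cs
... | nothing             = nothing
... | just (x , [])       = nothing
... | just (x , c ∷ rest) =
  if c == ')' then just (x ∷ [] , rest)
  else if c == ',' then consResult x (elems k p rest)
  else nothing

parseGeneralList : {A : Set} → Parser A → Parser (List A)
parseGeneralList p []       = nothing
parseGeneralList p (c ∷ cs) =
  if c == '(' then elems (length cs) p cs else nothing  -- input length is enough fuel

parseNesList : (n : ℕ) → Parser (NesList n)
parseNesList zero    = parseNat
parseNesList (suc n) = parseGeneralList (parseNesList n)

-- Succeed only if the whole string was consumed.
parseList : (depth : ℕ) → String → Maybe (NesList depth)
parseList depth s with parseNesList depth (toList s)
... | just (x , []) = just x
... | _             = nothing

p1 : parseList 2 "((1,2,3),(4,5,6))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ [])
p1 = refl

lp5 : parseList 3 "(((1,2),(3,4)),((5,6),(7,8)))" ≡ just (((1 ∷ 2 ∷ []) ∷ (3 ∷ 4 ∷ []) ∷ []) ∷ ((5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ []) ∷ []) ∷ [])
lp5 = refl
```

The fuel is the length of the remaining input, which suffices because every pass through elems consumes at least one separator or closing parenthesis. The recursion in parseNesList is on the depth, so it is structural as well.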
