
Here's an example to illustrate what I wanted to do:

(ns sample
  (:require [clojure.zip :as zip]
            [clojure.data.zip.xml :refer [attr text xml-> xml1->]]
            [clojure.data.xml :as xml]))

;; From https://github.com/clojure/data.zip/blob/ca5a2efcc1c865baa25f904d7d9f027809b8f738/src/test/clojure/clojure/data/zip/xml_test.clj
(def atom1 (xml/parse-str "<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom'>
  <id>tag:blogger.com,1999:blog-28403206</id>
  <updated>2008-02-14T08:00:58.567-08:00</updated>
  <title type='text'>n01senet</title>
  <link rel='alternate' type='text/html' href='http://n01senet.blogspot.com/'/>
  <entry>
    <id>1</id>
    <published>2008-02-13</published>
    <title type='text'>clojure is the best lisp yet</title>
    <author><name>Chouser</name></author>
  </entry>
  <entry>
    <id>2</id>
    <published>2008-02-07</published>
    <title type='text'>experimenting with vnc</title>
    <author><name>agriffis</name></author>
  </entry>
</feed>
"))

(def atom1z (zip/xml-zip atom1))

(defn get-entries-titles [z]
  (xml-> z :entry :title text))

(defn get-entries [z]
  (xml-> z :entry))

(defn get-titles [z]
  (xml-> z :title))

(defn f1 []
  (-> atom1z get-entries-titles))

(defn f2 []
  (-> atom1z get-entries get-titles text))

Running f1 produces an expected result:

("clojure is the best lisp yet" "experimenting with vnc")

Running f2 throws an exception:

ClassCastException clojure.lang.LazySeq cannot be cast to clojure.lang.IFn  clojure.zip/node (zip.clj:67)

My goal was to split processing into steps:

  • Get the xml
  • Get entries from xml
  • Get title from entries

That way I can split things into separate functions. For example, I might need to pick up different attributes of elements that belong to different parts of the XML and collect them into a flat output collection (e.g. take all the <id> elements from atom1 above, resulting in a vector of IDs).

I want to have methods that process each type of node (in the above example, get the ID from feed and get the ID from entry) and then chain them as above. I.e. descend from the top, pick things from each of the levels, if needed call a method that further processes the children in the same fashion (using zippers).

To put it another way - I want to:

  1. Create a zipper
  2. Forward that zipper to one of processing methods
  3. Move the zipper to a specific location
  4. Process that location
  5. Process children in the same fashion (steps 2. - 5.), using the location set in step 3

However, it looks like it doesn't work that way, based on the exception in f2. How can this be done? If this is not how one should use clojure.data.zip.xml, what would be the recommended approach, with decomposition in mind?

xml-> accepts a single zipper and returns a seq of zippers. If you want to compose functions like this, you'll need to rewrite them to accept a seq of zippers. - Alex

Thanks @Alex, of course that makes sense (and I even required xml1->...). Care to post as an answer that I can accept? Is this the recommended approach, are there any alternatives, and how do you usually approach these things? - levant pied

1 Answer


I ran into the same problem. The reason you can't call two xml-> operators in succession is simple: as Alex already mentioned, xml-> takes a single location and returns a seq of locations, so the second call receives a seq where it expects a zipper. There are two answers to your question. One idiomatic way to process a tree (or XML document) is to process each level of the tree separately:

(map (fn [entry] (xml-> entry :title text))
     (get-entries atom1z))
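A small sketch of how the per-level approach could be applied to f2. Note that map keeps one inner seq per entry, so if a flat result is wanted, mapcat is one way to get it. The feed-z data and the sketch.per-level namespace below are hypothetical stand-ins for atom1z: a cut-down feed without the xmlns attribute, since newer clojure.data.xml releases may turn namespaced elements into qualified keywords that :entry would no longer match.

```clojure
(ns sketch.per-level
  (:require [clojure.zip :as zip]
            [clojure.data.xml :as xml]
            [clojure.data.zip.xml :refer [text xml->]]))

;; Hypothetical sample data: same shape as atom1, but without xmlns.
(def feed-z
  (zip/xml-zip
   (xml/parse-str
    (str "<feed><id>f</id>"
         "<entry><id>1</id><title>first</title></entry>"
         "<entry><id>2</id><title>second</title></entry>"
         "</feed>"))))

;; Each step takes ONE location and returns a seq of locations.
(defn get-entries [z] (xml-> z :entry))
(defn get-titles [z] (xml-> z :title))

;; map yields one inner seq per entry: (("first") ("second"))
(def nested-titles
  (map (fn [entry] (xml-> entry :title text))
       (get-entries feed-z)))

;; mapcat applies the next step to every location and flattens the result,
;; so the steps from the question can be chained without nesting.
(defn f2 []
  (->> (get-entries feed-z)
       (mapcat get-titles)
       (map text)))
```

With this data, (f2) should return ("first" "second"), which mirrors what f1 returns in the question.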

If what you actually want is to combine the navigation steps themselves, one option is to write a macro that builds a final xml-> call like the one in get-entries-titles. However, you should use a macro only if it actually helps you. Think carefully: what is it that you are missing from clojure.data.zip.xml for the purposes of processing XML?
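Before reaching for a macro, it may be worth checking whether plain functions are enough. As far as I can tell from its docstring, xml-> also accepts ordinary functions as steps: each function is called with a single location, and if it returns a collection, every element is passed on to the next step. Under that assumption, the helpers from the question can be handed to xml-> directly. The feed-z data, the get-ids helper, and the sketch.compose namespace are hypothetical (again a cut-down feed without xmlns).

```clojure
(ns sketch.compose
  (:require [clojure.zip :as zip]
            [clojure.data.xml :as xml]
            [clojure.data.zip.xml :refer [text xml->]]))

;; Hypothetical sample data: same shape as atom1, but without xmlns.
(def feed-z
  (zip/xml-zip
   (xml/parse-str
    (str "<feed><id>f</id>"
         "<entry><id>1</id><title>first</title></entry>"
         "<entry><id>2</id><title>second</title></entry>"
         "</feed>"))))

(defn get-entries [z] (xml-> z :entry))
(defn get-titles [z] (xml-> z :title))

;; Hypothetical helper: the text of the direct <id> children of a location.
(defn get-ids [z] (xml-> z :id text))

;; Functions used as xml-> steps: each one receives a single location.
(def titles
  (xml-> feed-z get-entries get-titles text))

;; Flat vector of IDs picked from two levels: the feed itself, then each entry.
(def all-ids
  (vec (concat (get-ids feed-z)
               (mapcat get-ids (get-entries feed-z)))))
```

Here titles should be ("first" "second") and all-ids should be ["f" "1" "2"]. Each helper still handles only one kind of node, which is the decomposition the question asks for.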