3
votes

EDIT #2: This entire question and exploration were based on my missing the fundamental notion of zippers; that they represent a perspective in a datastructure from the point of view of a particular node. So a zipper is - at all times - a pair of the current node and what the rest of the tree looks like from the perspective of that node. I was originally trying to generate a whole new structure from the zipper, while the zipper itself was all I needed, all along. I'll leave this all up for posterity, in the hope that somebody else is helped by it (or so it serves as a warning to any successors!).

Original question:

I'm trying to get my head around using zippers to manipulate trees. The specific problem is that I need to generate at runtime routes between two nodes matching arbitrary criteria in an arbitrary tree.

I thought I could use the path function to get a route to a location by calling path on the current location. But the path returned seems to omit the last step(s) required to get there.

For example:

(def test-zip (vector-zip [0 [1] [2 [3 [4] 5 [6 [7] [8]]]]]))
(-> test-zip 
    down right right down 
    right down right right
    node)

gives 5, but

(-> test-zip 
    down right right down 
    right down right right
    path)

gives

[[0 [1] [2 [3 [4] 5 [6 [7] [8]]]]] [2 [3 [4] 5 [6 [7] [8]]]] [3 [4] 5 [6 [7] [8]]]]

which isn't the same location (it's missing the effect of the last three steps, down right right).

It looks like the path function only gets you to the parent location in the tree, ignoring any siblings between you and the actual location.

Am I missing the point of the path function? I'd assumed that given a tree and a path, applying the path to the tree would bring you to the original location of the path, not partly there.

UPDATE: I've used the following function definition to compile a path of nodes from a start location to an end location:

(defn lca-path [start-loc end-loc]
  (let [sczip (z/root start-loc)
        start-path (z/path start-loc)
        start-node (z/node start-loc)
        end-path (z/path end-loc)
        end-node (z/node end-loc)
        lca-path (filter (set start-path) end-path)
        lca-node [(last lca-path)]
        lca-to-start (conj (vec (drop (count lca-path) start-path)) start-node)
        lca-to-end (conj (vec (drop (count lca-path) end-path)) end-node)
        ]

    (concat (reverse lca-to-start) lca-node lca-to-end))
  )

Pretty heavily influenced by the chat with @Mark Fisher, thanks!

1

1 Answers

4
votes

I think you're right, it's the parent it paths to, and this is confirmed looking at this site, path returns "all subtrees from the root of the tree down to just above the current loc".

For your structure, the tree is:

   A
 / | \
0  o  B
   | / \
   1 2  C __
       /|\  \
      3 o 5  o
        |   /|\
        4  6 o o
             | |
             7 8

So the 3 nodes marked A, B, C are the parent nodes what lead to the parent of 5, which are:

A: [0 [1] [2 [3 [4] 5 [6 [7] [8]]]]]
B: [2 [3 [4] 5 [6 [7] [8]]]]
C: [3 [4] 5 [6 [7] [8]]]

which is the result you're getting.

EDIT: I created this function to search between two values in a vector and return the path between them. Does this help? I'm not an expert on zippers so this is learning for me too.

(defn between [v s e]
  (let [zv (zip/vector-zip v)
        find-loc (fn [zv l]
                   (loop [loc zv]
                     (if (= l (zip/node loc))
                       loc 
                       (if (zip/end? loc)
                           nil
                           (recur (zip/next loc))))))
        sv (zip/vector-zip (-> (find-loc zv s) zip/up zip/node))
        e-in-sv (find-loc sv e)
        path-s-e (when (some? e-in-sv)
                   (zip/path e-in-sv))]
    path-s-e))

It finds the path between the parent of the start value and parent of end value, and you can use it like:

(def sov [0 [1] [2 [3 [4] 5 [6 [7] [8]]]]])
(between sov 2 6)
=> [[2 [3 [4] 5 [6 [7] [8]]]] [3 [4] 5 [6 [7] [8]]] [6 [7] [8]]]

I'm walking the tree to find the parent of the starting value, and then create a new zipper from its parent so that the path is limited from start point. The helper function "find-loc" returns a zipper for the entry you're looking for, e.g. "5" within a vector. sv is then the parent vector containing the start value.

Where there is no path it returns nil, e.g. between 1 and 4 you'd have to traverse up 2 elements.

This was in response to your comments of having to find the value in the final node (e.g. 5 in [3 [4] 5 [6 [7] [8]]]), which i don't think you need to do, but without the real example of what you're doing is difficult to comment on.

BTW, there may be much better ways of traverse/finding values in the vector than the custom function, but as I say zippers are pretty new for me too.