0
votes

I'm using the hierarchy.to_tree from scipy, and I'm interested in getting a print out of all root-to-leaf paths:

enter image description here

10.8.3 10.8.5 10.2.2

from scipy.cluster import hierarchy
a = hierarchy.to_tree(linkage_matrix)

I've given it a try

linkage_matrix
[[2, 3, 0.06571365, 2], [0, 10, 0.07951425, 2], [5, 6, 0.09405724, 2], [11, 13, 0.10182075, 3], [1, 12, 0.12900146, 3], [14, 15, 0.13498948, 5], [8, 9, 0.16806049, 2], [7, 16, 0.1887918, 4], [17, 19, 0.2236683, 9], [18, 20, 0.29471335, 11], [4, 21, 0.45878, 12]]

from scipy.cluster import hierarchy
a = hierarchy.to_tree(linkage_matrix)

def parse_tree(tree, path):
    path = path
    if path ==[]:
        path.append(str(tree.get_id()))
    if tree.is_leaf() is False:
        left = tree.get_left()
        left_id = str(left.get_id())
        if left.is_leaf() is False:
            path.append(left_id)
            parse_tree(left, path)
            path.pop()
        else:
            parse_tree(left, path)
        right = tree.get_right()
        right_id = str(right.get_id())
        if right.is_leaf() is False:
            path.append(right_id)
        parse_tree(right, path)
    else:
        path.append(str(tree.get_id()))
        print(('.').join(path)) 
        path.pop()

parse_tree(a, [])

But obviously my logic is completely wrong, specifically it breaks down when the left node is not a leave (22.21.20.17.15.19.7 should be 22.21.20.19.7). I'm looking for new ways, I have not considered.

For the below example tree, all root-to-leaf paths are:

1

1 Answers

3
votes

Without looking at your code, you should be doing something like:

print_paths(tree, seen):
    seen = seen
    seen.append(tree.value)
    if not tree.children:
        print(seen)
    else:
        map(print_paths, tree.children)

Having now seen your code, try something like:

def parse(tree, p):
    path = p[:]
    path.append(str(tree.get_id()))
    if tree.is_leaf():
        print('.'.join(path))
    else: 
        #Here I assume get_left() returns some falsey value for no left child
        left = tree.get_left()
        if left:
            parse(left, path)
        right = tree.get_right()
        if right:
            parse(right, path)