Algorithm to find difference between two uniquely labeled general trees

Question

I'm looking for an algorithm that finds the minimal changes (insert, delete, move) to get from tree1 to tree2.

The tree is a general and uniquely labeled tree. This means that each node's value is some kind of unique identifier, like a UUID. Each node can have children and the order of those child nodes is important. Example tree 1

A
 - B
 - C
   - D
   - E
 - F

Example tree 2

A
 - B
 - F
  - C
    - D
 - G

expected changes from tree 1 to tree 2

move(atIndex: 1, inParent: A, toIndex: 0, inParent: F)
delete(atIndex: 1, inParent: C)
insert(atIndex: 2, inParent: A)

I need this for a macOS App to animated changes in an NSOutlineView. I found some algorithms to find changes between general trees, but only for ordered and labeled trees and none for uniquely labeled trees. The algorithms for non-uniquely labeled trees semes to be really complex and I thought that if the labels are unique there must be a simpler and more efficient algorithm.

BTW: NSOultineView does not have a root node but an array of nodes at the beginning. I don't know if that's important.

David Eisenstat David Eisenstat · Accepted Answer · 2020-11-26T16:30:59

You're right -- label uniqueness does simplify the problem.

For each label present in both trees, compute a sequence of operations to edit its children. This involves computing a longest common subsequence and issuing appropriate moves/inserts/deletes for all nodes not in this subsequence. (In case a label is present in both trees but with different parents, we should suppress the delete for the old parent.)

This set of operations constitutes a lower bound, but if we just do them, we may hit a situation where we try to move a node under one of its descendants. Perhaps it's enough to detect when this is about to happen and split it into two moves. If you really need the optimal sequence of moves, then I think there's a feedback arc/node set problem to be solved, though it might not be NP-hard here due to the tree structure.

Algorithm to find difference between two uniquely labeled general trees

2 Answers