I have been reading about union-find problem. The two main improvements are path compression and union by rank. As far as I understand union by rank is used to determine how to combine disjoint trees. If we have two disjoint trees T1 and T2, then we attach the root of the tree with smaller rank to the tree with higher rank. If we don't use path compression then rank is just the depth of a tree. This makes sense since we don't want to increase the depth of out tree,since it directly affects both finds and unions.
My problem is when we use path compression as well. I keep reading that the two optimizations complement each other, but I don't see that. Because of path compression, rank is no longer depth of the tree (it becomes an upper bound on depth). Suppose T1 has 2 branches (let rank of T1 be 3), and T2 has depth 2 and rank 2. Now suppose we perform find operation (with path compression) on the leaf of T1 marked with "*" below. Now if we union root of T1 and root of T2, then T2 would be attached to the root of T1(since rank is not updated by find). The resulting tree has depth 3. But we could have had a better performance if we attached T1 to T2.
T1: o (Rank = 3) T2: o (Rank = 2)
/ \ |
o o o
| |
o o
|
*
After a find on leaf of T1("*"), then union on roots of T1 and T2 we get
T1: o (Rank = 3) T2: o (Rank = 2)
/| |\ |
* o o o o
|
o
Result of T1 union T2
o
/ | | |\
* o o o o Rank = 3 and Max Depth = 3
|
o
|
o
Am i missing something here? How do path compression and union by rank complement each other? I know rank is just an upper bound on depth of a tree but I don't see how union by rank is improving the overall performance of the structure. How is this better than a union that combines the roots randomly?
Thanks for the help in advance.