First, define the height of a tree (as used for AVL trees):
height(leaf) = 1
height(node) = 1 + max(height(node.left), height(node.right))
Also, define the depth of a path (as used for red-black trees, a path is the chain of descendants from a given node to some leaf) to be the number of black nodes on the path.
As you point out, the tricky bit about coloring an AVL tree as a red-black tree is making sure that every path has the same depth. You will need to use the AVL invariant: that the subtrees of any given node can differ in height by at most one.
Intuitively, the trick is to use a coloring algorithm whose depth is predictable for a given height, such that you don't need to do any further global coordination. Then, you can tweak the coloring locally, to ensure the children of each node have the same depth; this is possible only because the AVL condition puts strict limits on their height difference.
This tree-coloring algorithm does the trick:
color_black(x):
x.color = black;
if x is a node:
color_children(x.left, x.right)
color_red(x): // height(x) must be even
x.color = red
color_children(x.left, x.right) // x will always be a node
color_children(a,b):
if height(a) < height(b) or height(a) is odd:
color_black(a)
else:
color_red(a)
if height(b) < height(a) or height(b) is odd:
color_black(b)
else:
color_red(b)
For the root of the AVL tree, call color_black(root)
to ensure b.
Note that the tree is traversed in depth-first order, also ensuring a.
Note that red nodes all have even height. Leaves have height 1, so they will be colored black, ensuring c. Children of red nodes will either have odd height or will be shorter than their sibling, and will be marked black, ensuring d.
Finally, to show e. (that all paths from root have the same depth),
use induction on n>=1
to prove:
- for odd
height = 2*n-1
,
- color_black() creates a red-black tree, with depth
n
- for even
height = 2*n
,
- color_red() sets all paths to depth
n
- color_black() creates a red-black tree with depth
n+1
Base case, for n = 1
:
- for odd
height = 1
, the tree is a leaf;
- color_black() sets the leaf to black; the sole path has depth 1,
- for even
height = 2
, the root is a node, and both children are leaves, marked black as above;
- color_red() sets node to red; both paths have depth 1
- color_black() sets node to black; both paths have depth 2
The induction step is where we use the AVL invariant: sibling trees can differ in height by at most 1. For a node with a given height
:
- subcase A: both subtrees are
(height-1)
- subcase B: one subtree is
(height-1)
, and the other is (height-2)
Induction step: given the hypothesis is true for n
, show that it holds for n+1
:
for odd height = 2*(n+1)-1 = 2*n+1
,
- subcase A: both subtrees have even height
2*n
- color_children() calls color_red() for both children,
- via induction hypothesis, both children have depth
n
- for parent, color_black() adds a black node, for depth
n+1
- subcase B: subtrees have heights
2*n
and 2*n-1
- color_children() calls color_red() and color_black(), resp;
- for even height
2*n
, color_red() yields depth n
(induction hyp.)
- for odd height
2*n-1
, color_black() yields depth n
(induction hyp.)
- for parent, color_black() adds a black node, for depth
n+1
for even height = 2*(n+1) = 2*n + 2
- subcase A: both subtrees have odd height
2*n+1 = 2*(n+1)-1
- color_children() calls color_black() for both children, for depth
n+1
- from odd height case above, both children have depth
n+1
- for parent, color_red() adds a red node, for unchanged depth
n+1
- for parent, color_black() adds a black node, for depth
n+2
- subcase B: subtrees have heights
2*n+1 = 2*(n+1)-1
and 2*n
- color_children() calls color_black() for both children, for depth
n+1
- for odd height
2*n+1
, color_black() yields depth n+1
(see above)
- for even height
2*n
, color_black() yields depth n+1
(induction hyp.)
- for parent, color_red() adds a red node, for depth
n+1
- for parent, color_black() adds a black node, for depth
n+2 = (n+1)+1