2
votes

I read this statement somewhere that the nodes of any AVL tree T can be colored “red” and “black” so that T becomes a red-black tree.

This statement seems quite convincing but I didn't understand how to formally proof this statement.

According to wiki, A red black tree should satisfy these five properties:

a.A node is either red or black.

b.The root is black. This rule is sometimes omitted. Since the root can always be changed from red to black, but not necessarily vice versa,

c. All leaves (NIL) are black.

d.If a node is red, then both its children are black.

e.Every path from a given node to any of its descendant NIL nodes contains,the same number of black nodes.

The four conditions is quite simple, I got stuck how to proof statement 5

3

3 Answers

4
votes

First, define the height of a tree (as used for AVL trees):

height(leaf) = 1
height(node) = 1 + max(height(node.left), height(node.right))

Also, define the depth of a path (as used for red-black trees, a path is the chain of descendants from a given node to some leaf) to be the number of black nodes on the path.


As you point out, the tricky bit about coloring an AVL tree as a red-black tree is making sure that every path has the same depth. You will need to use the AVL invariant: that the subtrees of any given node can differ in height by at most one.

Intuitively, the trick is to use a coloring algorithm whose depth is predictable for a given height, such that you don't need to do any further global coordination. Then, you can tweak the coloring locally, to ensure the children of each node have the same depth; this is possible only because the AVL condition puts strict limits on their height difference.


This tree-coloring algorithm does the trick:

color_black(x):
  x.color = black;
  if x is a node:
    color_children(x.left, x.right)

color_red(x):  // height(x) must be even
  x.color = red
  color_children(x.left, x.right)  // x will always be a node

color_children(a,b):
  if height(a) < height(b) or height(a) is odd:
    color_black(a)
  else:
    color_red(a)
  if height(b) < height(a) or height(b) is odd:
    color_black(b)
  else:
    color_red(b)

For the root of the AVL tree, call color_black(root) to ensure b. Note that the tree is traversed in depth-first order, also ensuring a.

Note that red nodes all have even height. Leaves have height 1, so they will be colored black, ensuring c. Children of red nodes will either have odd height or will be shorter than their sibling, and will be marked black, ensuring d.

Finally, to show e. (that all paths from root have the same depth), use induction on n>=1 to prove:

  • for odd height = 2*n-1,
    • color_black() creates a red-black tree, with depth n
  • for even height = 2*n,
    • color_red() sets all paths to depth n
    • color_black() creates a red-black tree with depth n+1

Base case, for n = 1:

  • for odd height = 1, the tree is a leaf;
    • color_black() sets the leaf to black; the sole path has depth 1,
  • for even height = 2, the root is a node, and both children are leaves, marked black as above;
    • color_red() sets node to red; both paths have depth 1
    • color_black() sets node to black; both paths have depth 2

The induction step is where we use the AVL invariant: sibling trees can differ in height by at most 1. For a node with a given height:

  • subcase A: both subtrees are (height-1)
  • subcase B: one subtree is (height-1), and the other is (height-2)

Induction step: given the hypothesis is true for n, show that it holds for n+1:

for odd height = 2*(n+1)-1 = 2*n+1,

  • subcase A: both subtrees have even height 2*n
    • color_children() calls color_red() for both children,
    • via induction hypothesis, both children have depth n
    • for parent, color_black() adds a black node, for depth n+1
  • subcase B: subtrees have heights 2*n and 2*n-1
    • color_children() calls color_red() and color_black(), resp;
    • for even height 2*n, color_red() yields depth n (induction hyp.)
    • for odd height 2*n-1, color_black() yields depth n (induction hyp.)
    • for parent, color_black() adds a black node, for depth n+1

for even height = 2*(n+1) = 2*n + 2

  • subcase A: both subtrees have odd height 2*n+1 = 2*(n+1)-1
    • color_children() calls color_black() for both children, for depth n+1
    • from odd height case above, both children have depth n+1
    • for parent, color_red() adds a red node, for unchanged depth n+1
    • for parent, color_black() adds a black node, for depth n+2
  • subcase B: subtrees have heights 2*n+1 = 2*(n+1)-1 and 2*n
    • color_children() calls color_black() for both children, for depth n+1
    • for odd height 2*n+1, color_black() yields depth n+1 (see above)
    • for even height 2*n, color_black() yields depth n+1 (induction hyp.)
    • for parent, color_red() adds a red node, for depth n+1
    • for parent, color_black() adds a black node, for depth n+2 = (n+1)+1
1
votes

Well, simple case for #5 is a single descendant, which is a leaf, which is black by #3.

Otherwise, the descendant node is red, which is required to have 2 black descendants by #4.

Then these two cases recursively apply at each node, so you'll always have the same amount of black nodes in each path.

0
votes

It is (almost always) completely impossible to convert an AVL tree to a Red-Black tree; Of course, both are trees and the shape can be the same, but the cost is too large.

A red-black tree has an explicit black tree height, which is a global height information scattered everywhere.

An AVL tree has no global height information at all. The worst case is defined by a very simple counterexample; an almost perfect tree with one hole(one empty node).

The balance 0 means that the Nth 2-adic decision is lost. The more the tree gets balanced, the harder it gets to find the height information.

The ability to rebalance the whole tree is logically equivalent to the statement that you can find that single hole in a speed that is logically impossible. The red-black tree spends extra swapping cost to maintain the quasi-perfect black tree height, which is a global quasi-perfect balance; An AVL tree has no such global restrictions.