Using notation:
H
= Balanced binary tree height
L
= Total number of leaves in a full binary tree of height H
N
= Total number of nodes in a full binary tree of height H
The relation is L = (N + 1) / 2
as demonstrated below. That would be the maximum number of leaf nodes for a given tree height H
. The minimum number of nodes at a given height is 1
(cannot be zero, because then the tree height would be reduced by one).
Drawing trees with increasing heights, one can observe that:
H = 1, L = 1, N = 1
H = 2, L = 2, N = 3
H = 3, L = 4, N = 7
H = 4, L = 8, N = 15
...
The relation between tree height (H
) and the total number of leaves (L
)
and the total number of nodes (N
) becomes apparent:
L = 2^(H-1)
N = (2^H) - 1
The correctness is easily proven using mathematical induction.
Examples above show that it is true for small H
.
Simply put in the value of H
(e.g. H=1
) and compute L
and N
.
Assuming the formulas are true for some H
, one can show they are also true for HH=H+1
:
For L
, the assumption is that L=2^(H-1)
is true.
As each node has two children, increasing the height by one
is going to replace each leaf node with two new leaves, effectively
doubling the total number of leaves. Therefore, in case of HH=H+1
,
the total number of leaves (LL
) is going to be doubled:
LL = L * 2
= 2^(H-1) * 2
= 2^(H)
= 2^(HH-1)
For N
, the assumption is that N=(2^H)-1
is true.
Increasing the height by one (HH=H+1
) increases the total number
of nodes by the total number of added leaf nodes. Therefore,
NN = N + LL
= (2^H) - 1 + 2^(HH-1)
= 2^(HH-1) - 1 + 2^(HH-1)
= 2 * 2^(HH-1) - 1
= (2^HH) - 1
Applying the mathematical induction, the correctness is proven.
H
can be expressed in terms of N
:
N = (2^H) - 1 // +1 to both sides
N + 1 = 2^H // apply log2 monotone function to both sides
log2(N+1) = log2(2^H)
= H * log2(2)
= H
The direct relation between L
and N
(which is the answer to the question asked) is:
L = 2^(H - 1) // replace H = log2(N + 1)
= 2^(log2(N + 1) - 1)
= 2^(log2(N + 1) - log2(2))
= 2^(log2( (N + 1) / 2 ))
= (N + 1) / 2
For Big O analysis, the constants are discarded, so the Binary Search Tree lookup time complexity (i.e. H
with respect to the input size N
) is O(log2(N))
. Also, keeping in mind the formula for changing the logarithm base:
log2(N) = log10(N) / log10(2)
and discarding the constant factor 1/log10(2)
, where instead of 10
one can have an arbitrary logarithm base, the time complexity is simply O(log(N))
regardless of the chosen logarithm base constant.