3
votes

Given a binary tree the problem is to find all root-to-leaf paths. And we know the algorithm by passing the path in the form of a list and adding it to our result as soon as we hit a leaf.

My question How much space does storing all the path consumes. My intuition is that each path is going to consume memory order of the height of tree(O(h)), and if there are 2*n - 1 nodes in our full binary tree and then there are n leafs each corresponding to a path and so the space complexity would be O(n*log(n)) assuming the tree is height balanced. Is my analysis correct?

1
If there are 2^n -1 nodes then height of binary tree is log(2^n-1) . The total space complexity is O(height * num of leaf nodes) = O(n*log(2^n-1)) ~ O(n^2)Sandeep Reddy Goli
@SandeepReddyGoli: no, there are 2n-1 nodes, not 2^n-1. And if there are indeed 2^n-1 nodes, the space compexity is O(n 2^n), not O(n²). Undue upvote.Yves Daoust
Storing all the paths can be done at zero cost as the initial tree already contains them and any path can be enumerated in optimal time O(h). Even better, the complete set of paths is enumerated in time O(n).Yves Daoust
In any tree, the total path length is trivially n.h*, where h* denotes the average path length, which does not exceed the maximum path length, let h°. In a completely imbalanced tree, h* and h° are both O(n), while in a balanced tree, they are both O(Log n).Yves Daoust
@Yvues yes i was wrong.Thanks for pointing out.If the number of nodes are 2^n -1 ,then number of leaf nodes approx will be 2^(n-1) ,then total complexity is O(n*2^n)Sandeep Reddy Goli

1 Answers

2
votes

Your reasoning is correct, but it can be made more exact. A balanced binary tree is not necessarily a full binary tree.


Let N(h) be the number of paths when the height is h. Then N(h) ≤ 2 N(h - 1) This is because, given a tree of height h, the children are each trees of height at most h - 1. So

N(h) = O(2h).


Now we need to bound h. Since h appears in the exponent, it is not enough to find its order of growth. More exactly, it is known that

n ≥ 2h - 1

so

h ≤ log(n + 1)

Inserting this to what we have before

N(h) = O(2log(n + 1)) = O(n).


As you wrote the memory is the sum, per path, of the nodes on the path. The sum of nodes on each path is at most log(n + 1). Combining all the above gives O(n log(n)).