2
votes

I have read up on red black trees, and I understand that they try to solve the problem of the tree becoming unbalanced. However, what if you were to use randomized insertion. For example:

Consider the following sorted numbers that need to be inserted:

1,2,3,4,5,6,7,8,9,10

If we naively inserted into a BST the tree would look like: 1 2 3 ..

In this case the tree will be super unbalanced and searching would be linear O(N)

However, if we randomized the insertion, it may look more balanced (but probably not balanced as a red black tree in the average case?) . If we used a red black tree, it'll guarantee a near balanced BST but with a bit of overhead. When does one draw the line of wanting this extra overhead for efficiency vs using randomized insertion BST besides for "online algorithms"( self balancing binary search trees )

2

2 Answers

2
votes

There is such line. You have always remember about motivation of using any kind of dynamic data structure (like BST and Red-Black Tree). The motivation is quite a simple one - to keep the data in some shape (ordered in case of BST). So, if you don't want keeping it you might use something like sorted array. Array sorting can be done on O(n log n). And you can do anything with sorted data (like min/max/nth) in constant time! This is extremaly fast! But, what if you're plannig to add a new value into sorted array. This is where the things is getting interesting. So, you have to shift your array and insert a new value in a proper possition. It takes O(n). And it doesn't sound like a right way. But, good news. There are trees which can handle insertion and removal in O(log n) time.

So what about line. I would say. If you're planning only insert new elements into tree. And the input values are random. So, BST is perfectly OK for this task. But, if you're planing to do some removal, that you should new self-balancing tree like RBTree, because BST unbalances due to removals (the remove operation on BST is non-symmetric and it produce the tree with height sqrt(n)).

There is a third case. You can try to find an symmetric removal algorithm for BST (stil not solved for 50 years) and be a rock-star of computer science. Goog luck with these things!

0
votes

use AVL TREE for balancing.. AVL tree is a self-balancing binary search tree, and it was the first such data structure to be invented.1 In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.

Wikipedia has an article here.