1
votes

I'm aware of ways to keep binary search trees balanced/self-balancing using rotations.

I am not sure if my case needs to be that complicated. I don't need to maintain any sorted order property like with self-balancing BSTs. I just have an ordinary binary tree that I may need to delete nodes or insert nodes. I need try to maintain balance in the tree. For simplicity, my binary tree is similar to a segment tree, and every time a node is deleted, all the nodes along the path from the root to this node will be affected (in my case, it's just some subtraction of the nodal values). Similarly, every time a node is inserted, all the nodes from the root to the inserted node's final location will be affected (an addition to nodal values this time).

What would be the most straightforward way to keep a tree such as this balanced? It doesn't need to be strictly as height balanced as AVL trees, but something like RB trees or maybe slightly less balanced is acceptable as well.

2
Do you mean that if a node is inserted, it can be inserted at any spot of our own choice?trincot
@trincot Yes, for my use case it can be inserted at any spot so long as we don't have an outrageously unbalanced tree. I don't have a good definition of "outrageous unbalanced." Ideally, I just want it to be as balanced as possible.user5965026

2 Answers

1
votes

If a new node does not have to be inserted at a particular spot -- possibly determined by its own value and the values in the tree -- but you are completely free to choose its location, then you could maintain the shape of the tree as a complete tree:

In a complete binary tree every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible.

An array is a very efficient data structure for a complete tree, as you can store the nodes in their order in a breadth-first traversal. Because the tree is given to be complete, the array has no gaps. This structure is commonly used for heaps:

Heaps are usually implemented with an array, as follows:

  • Each element in the array represents a node of the heap, and
  • The parent / child relationship is defined implicitly by the elements' indices in the array.

Example of a complete binary max-heap with node keys being integers from 1 to 100 and how it would be stored in an array.

In the array, the first index contains the root element. The next two indices of the array contain the root's children. The next four indices contain the four children of the root's two child nodes, and so on. Therefore, given a node at index i, its children are at indices 2i + 1 and 2i + 2, and its parent is at index floor((i-1)/2). This simple indexing scheme makes it efficient to move "up" or "down" the tree.

Operations

In your case, you would define the insert/delete operations as follows:

  • Insert: append the node to the end of the array. Then perform the mutation needed to its ancestors (as you described in your question)
  • Delete: replace the node to be deleted with the node that currently sits at the very end of the array, and shorten the array by 1. Make the updates needed that follow from the change at these two locations -- so two paths from root-to-node are impacted.
0
votes

When balancing non-BSTs, the big question to ask is

Can your tree efficiently support rotations?

Some types of binary trees, like k-d trees, have a specific layer-by-layer structure that makes rotations infeasible. Others, like range trees, have auxiliary metadata in each node that's expensive to update after a rotation. But if you can handle rotations, then you can use just about any of the balancing strategies out there. The simplest option might be to model your tree on a treap: put a randomly-chosen weight field into each node, and then, during insertions, rotate your newly-added leaf up until its weight is less than its parent. To delete, repeatedly rotate the node with its lighter child until it's a leaf, then delete it.

If you cannot support rotations, you'll need a rebalancing strategy that does not require them. Perhaps the easiest option there is to model your tree after a scapegoat tree, which works by lazily detecting a node that's too deep for the tree to be balanced, then rebuilding the smallest imbalanced subtree possible into a perfectly-balanced tree to get everything back into order. Deletions are handled by rebuilding the whole tree once the number of nodes drops by some constant factor.