16
votes

I'm looking for an interval tree algorithm similar to the red-black interval tree in CLR but that supports merging of intervals by default so that there are never any overlapping intervals.

In other words if you had a tree containing two intervals [2,3] and [5,6] and you added the interval [4,4], the result would be a tree containing just one interval [2,6].

Thanks

Update: the use case I'm considering is calculating transitive closure. Interval sets are used to store the successor sets because they have been found to be quite compact. But if you represent interval sets just as a linked list I have found that in some situations they can become quite large and hence so does the time required to find the insertion point. Hence my interest in interval trees. Also there may be quite a lot of merging one tree with another (i.e. a set OR operation) - if both trees are large then it may be better to create a new tree using inorder walks of both trees rather than repeated insertions of each interval.

1
I've deleted my answer since I stupidly overlooked some cases. It was still possible to fix, but would become a lot more complicated. Anyway, since you updated to say you'll mostly be merging entire trees, the answer doesn't seem useful anymore, as in-order traversal will be faster.interjay
Oh ok. Sometimes I will be merging two large trees when inorder will probably be faster, but more often I will be adding a small tree to a large tree.Dave Griffiths

1 Answers

4
votes

The problem I see is that inserting a large interval can wipe out a large chunk of the tree, making it difficult to recover the red-black invariants.

I think it would be easier to use a splay tree, as follows. For simplicity, each tree contains two sentinels, an interval A to the left of all other intervals and an interval Z to the right. When inserting an interval I, splay I's predecessor-to-be H to the root of the tree. The tree looks like

   H
  / \
...  X
    / \
  ... ...

Now detach X and splay I's successor-to-be J to the root.

   H       J
  /       / \
...     ... ...

At this point all of the intervals that overlap I are in J's left subtree. Detach that subtree and put all of its nodes on the free list, extending I if necessary. Make I the parent of H and J

     I
    / \
   H   J
  /     \
...     ...

and continue on our merry way. This operation is O(log n) amortized, where n is the number of tree nodes (this can be proved by examining the splay tree potential function and doing a lot of algebra).


I should add that there's a natural recursive tree-to-tree merge by inserting the root of one tree and then merging the left and right subtrees. I don't know how to analyze it off-hand.