0
votes

The interval tree I'm currently learning has the following data structures for each node:

  • key: the median of endpoints of all the intervals
  • left, right: pointing to left and right subtrees
  • intervals: a list of intervals containing the intervals who contain key.

More accurate information can be found in this link.

I learned that the time complexity for query using interval tree is O(logn + k) where n is the number of intervals, and k is the number of reported results.

I don't quite understand the logn part. Suppose v is the current node in interval tree, and Q is the query interval. The algorithm will be something like this:

if v.key is in Q,
  add v.intervals into results
  search_on_left_subtree
  search_on_right_subtree
else if Q.right < key,
  add some intervals
  search_on_left_subtree
else
  add some intervals
  search_on_right_subtree

It seems that in the first if it's possible that we will need to go into two subtrees. Then wouldn't it be that in the worst case we may have to traverse all the tree nodes, making the time complexity O(n + k)?

1

1 Answers

0
votes

It comes from an observation that if you have an interval of length n, then we can divide it to k segments where k <= log2(n) (binary system kind of proves it)

So as we need to visit at worst log2(n) nodes: time complexity would be logarithmic.