How many times an index appear on a range of intervals?

Question

I have an array of k integer intervals in [1,n], where k ≤ 10⁵ and n ≤ 10⁵. For example, if k is 3 and n is 5, the list might be: {[1,4], [3,5], [2,3]}.

For arbitrary sub-arrays of this array of intervals, I want to be able to count how many intervals contain a given value. For example, in the above example array, I might take the sub-array from index 1 to index 2 (namely {[1,4], [3,5]}) and count the number of intervals in that sub-array contain the value 2 (just 1: [1,4]). This query would be expressed as the 3-tuple (min_index,max_index,value), in this case (1,2,2).

My idea is to build a segment tree from interval array where each node contains sorted intervals (sorted by their start index) lying on that node. Ex: Root will contain all the intervals sorted in ascending order of start index, its left child will contain sorted intervals of first half of interval array and right child will contain sorted intervals of second half and so on.

So in order to solve a query (min_index,max_index,value), I can just traverse the tree and whenever I am in a node which is completely inside [min_index,max_index] I can do a binary search over intervals and count the occurrences. In the worst case each query will take O(log^2n) operations.

Can we do better than it? Is there any way to process the data first and then answer each query in constant time?

When you say l<=index<=r, do you mean that the index of the interval in the interval array is bounded? Or that its endpoints are bounded somehow? Is index the same as idx? — nneonneo
Which programming language is this? Have you written any code so far? If this is a general question, perhaps you should try the Programmers site instead. — GoBusto
...and the contents of the first array of size N are irrelevant? How many queries are you handling? — nneonneo
@nneonneo It is just the index of the interval in the interval array. In other terms Lth to Rth interval. — Rana
Re: "Is there any way to process the data first and then answer each query in constant time?": From a CS-theory standpoint, yes: you have a finite domain, so if nothing else, you could precompute the answers to all possible queries. (Of course, this would require O(k²n) memory, so if that's the only way to do this, then it's not worth it!) — ruakh

kraskevich kraskevich · Accepted Answer · 2014-11-10T22:46:56

You can solve it much easier and faster offline using sweepline algorithm:

The events are: a query range starts, a query range ends.
Each interval adds 1 to a segment in the initial array.
To process the first type of events, you need to get the current sum in a given position.

Thus, you need a data structure that supports 2 operations: add value to a range and get a value of a single element. There are two options here:

Segment tree: it gives O((k + q) * log n) time complexity.
Sqrt-decomposition: O(sqrt(n) * k + q).

The second options gives O(1) time per query(not counting the first term as it does not depend on q).

How many times an index appear on a range of intervals?

1 Answers