SQL query to find all timestamps covered by an interval in A but not covered by an interval in B ("subtract" or "except" between multiple intervals)

Question

I have multiple tables in a PostgreSQL 9.4 database, where each row contains an interval as two columns "start" (inclusive) and "stop" (exclusive).

Consider the following pseudo-code (the tables are more complicated).

CREATE TABLE left (   
    start TIMESTAMP,   
    stop TIMESTAMP,   
    [...] 
);

CREATE TABLE right (
    start TIMESTAMP,   
    stop TIMESTAMP,   
    [...] 
);

The intervals are inclusive of the start, but exclusive of the stop.

I now need a query to find all possible intervals of time where there is a row in "left" covering the interval, but not simultaneously a row in "right" covering the same interval.

One interval in "left" can be cut up into any number of intervals in the result, be shortened, or be entirely absent. Consider the following graph, with time progressing from left to right:

left     [-----row 1------------------)   [--row 2--)    [--row 3----)
right  [--row1--)    [--row2--)    [--row3--)         
result          [----)        [----)        [-------)    [-----------)

In this tiny example, "left" has tree rows each representing three intervals and "right" has three rows, each representing three other intervals.

The result has four rows of intervals, which together cover all possible timestamps where there is a row/interval in "left" covering that timestamp, but not a row/interval in "right" covering the same timestamp.

The tables are of course in reality very much larger than three rows each - in fact I will frequently be wanting to perform the algorithm between two subqueries that have the "start" and "stop" columns.

I have hit a dead end (multiple dead ends, in fact), and am on the virge of just fetching all records into memory and applying some procedural programming to the problem...

Any solutions or suggestions of what thinking to apply is greatly appreciated.

Look into timestamp range type tsrange here's doc about ranges There are some useful operator for substraction, intersection etc. You could create tsrange from two values by calling ` tsrange(start_val, stop_val, '[)')`. Mayby this will help you. — Gabriel's Messanger
Are the left intervals non-overlapping? (and similar for the right intervals) — joop

klin klin · Accepted Answer · 2015-10-29T11:25:57

Change the types of columns to tsrange (or create an appropriate views):

CREATE TABLE leftr (
    duration tsrange
);

CREATE TABLE rightr (
    duration tsrange
);

insert into leftr values
('[2015-01-03, 2015-01-20)'),
('[2015-01-25, 2015-02-01)'),
('[2015-02-08, 2015-02-15)');

insert into rightr values
('[2015-01-01, 2015-01-06)'),
('[2015-01-10, 2015-01-15)'),
('[2015-01-18, 2015-01-26)');

The query:

select duration* gap result
from (
    select tsrange(upper(duration), lower(lead(duration) over (order by duration))) gap
    from rightr
    ) inv
join leftr
on duration && gap

                    result                     
-----------------------------------------------
 ["2015-01-06 00:00:00","2015-01-10 00:00:00")
 ["2015-01-15 00:00:00","2015-01-18 00:00:00")
 ["2015-01-26 00:00:00","2015-02-01 00:00:00")
 ["2015-02-08 00:00:00","2015-02-15 00:00:00")
(4 rows)

The idea:

l          [-----row 1------------------)   [--row 2--)    [--row 3----)
r        [--row1--)    [--row2--)    [--row3--)
inv(r)            [----)        [----)        [------------------------->
l*inv(r)          [----)        [----)        [-------)    [-----------)

SQL query to find all timestamps covered by an interval in A but not covered by an interval in B ("subtract" or "except" between multiple intervals)

4 Answers