I'm assuming that the (WardStart, WardEnd)
date ranges are strictly consecutive with no overlapping. For simplicity's sake, I'm also assuming that consecutive ranges don't exceed the max recursion default.
This can be solved using recursive SQL:
WITH
data AS (
SELECT *
FROM (
VALUES (123, 'hosp1', 'ward1', CAST('2015-04-01' AS DATE), CAST('2015-04-03' AS DATE)),
(123, 'hosp1', 'ward4', CAST('2015-04-03' AS DATE), CAST('2015-04-05' AS DATE)),
(123, 'hosp2', 'ward2', CAST('2015-04-05' AS DATE), CAST('2015-04-07' AS DATE)),
(123, 'hosp1', 'ward3', CAST('2015-04-07' AS DATE), CAST('2015-04-10' AS DATE)),
(123, 'hosp1', 'ward1', CAST('2015-04-10' AS DATE), CAST('2015-04-12' AS DATE))
) AS t(Spellno, Hospital, WardCode, WardStart, WardEnd)
),
consecutive(Spellno, Hospital, WardStart, WardEnd) AS (
SELECT Spellno, Hospital, WardStart, WardEnd
FROM data AS d1
WHERE NOT EXISTS (
SELECT *
FROM data AS d2
WHERE d1.Spellno = d2.Spellno
AND d1.Hospital = d2.Hospital
AND d1.WardStart = d2.WardEnd
)
UNION ALL
SELECT c.Spellno, c.Hospital, c.WardStart, d.WardEnd
FROM consecutive AS c
JOIN data AS d
ON c.Spellno = d.Spellno
AND c.Hospital = d.Hospital
AND c.WardEnd = d.WardStart
)
SELECT Spellno, Hospital, WardStart, MAX(WardEnd)
FROM consecutive
GROUP BY Spellno, Hospital, WardStart
ORDER BY Spellno, WardStart
Demo
Explanation
The first subquery in the recursive CTE consecutive
initialises the recursion to start with all rows for which there isn't any "previous row" for the same (Spellno, Hospital)
. This produces:
Spellno Hospital WardStart WardEnd
-----------------------------------------
123 hosp1 2015-04-01 2015-04-03
123 hosp2 2015-04-05 2015-04-07
123 hosp1 2015-04-07 2015-04-10
The recursion then produces a new row with the previous row's WardStart
(which is always the same for consecutive rows) and the current WardEnd
. This produces:
Spellno Hospital WardStart WardEnd
-----------------------------------------
123 hosp1 2015-04-01 2015-04-03 <-- Unwanted, "intermediary" row
123 hosp1 2015-04-01 2015-04-05
123 hosp2 2015-04-05 2015-04-07
123 hosp1 2015-04-07 2015-04-10 <-- Unwanted, "intermediary" row
123 hosp1 2015-04-07 2015-04-12
Finally, in the outer query, we select only the maximum value of WardEnd
for each consecutive series, producing the wanted result:
Spellno Hospital WardStart WardEnd
-----------------------------------------
123 hosp1 2015-04-01 2015-04-05
123 hosp2 2015-04-05 2015-04-07
123 hosp1 2015-04-07 2015-04-12
gaps and islands
. – Vladimir Baranov