0
votes

I am trying to group some records to find the first one for a particular site for a given client. The problem is that the records go back and forth between sites, so I need to keep non-consecutive site date ranges separate.

Given the sample data, I want to end up with 3 records - one for site 1 starting 7/3/18, a second for site 2 starting on 9/3/18 and the third for site 1 again starting 11/3/18.

SELECT 9999 AS CLIENT_ID, 1 AS SITE_NUM, '2018-07-03' AS START_DATE, '2018-08-05' AS CREATED_DATE, 1 AS RECORD_ID
INTO #TEMP
UNION 
SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-08-01' AS CONSENT_SIGN_DATE, '2018-10-05' AS CREATED_DATE, 2
UNION 
SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-07-03' AS CONSENT_SIGN_DATE, '2018-09-22' AS CREATED_DATE, 3
UNION 
SELECT 9999 AS MEMBER_ID, 2 AS SITE_NUM, '2018-09-03' AS CONSENT_SIGN_DATE, '2018-09-05' AS CREATED_DATE, 4
UNION 
SELECT 9999 AS MEMBER_ID, 2 AS SITE_NUM, '2018-10-03' AS CONSENT_SIGN_DATE, '2018-10-05' AS CREATED_DATE, 5
UNION 
SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-11-03' AS CONSENT_SIGN_DATE, '2018-11-05' AS CREATED_DATE, 6
UNION 
SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-12-01' AS CONSENT_SIGN_DATE, '2018-12-05' AS CREATED_DATE, 7

I've been playing with ROW_NUM but haven't been able to figure out how to separate the two sets of dates for Site 1.

SELECT *, ROW_NUMBER()OVER(PARTITION BY T.CLIENT_ID, T.SITE_NUM ORDER BY T.START_DATE, T.RECORD_ID)
FROM #TEMP T 
LEFT JOIN #TEMP T2 ON T2.CLIENT_ID = T.CLIENT_ID AND T2.RECORD_ID = T.RECORD_ID - 1
ORDER BY T.RECORD_ID

How can I group the results by client and consecutive dates for a single site?

2

2 Answers

2
votes

This is a gaps-and-islands problem. For this, the difference of row numbers is the best approach:

select t.client_id, t.site_num, min(t.start_date), max(t.start_date)
from (select t.*,
             row_number() over (partition by t.client_id order by T.START_DATE, T.RECORD_ID) as seqnum_c,
             row_number() over (partition by t.client_id, t.site_num order by T.START_DATE, T.RECORD_ID) as seqnum_cs
      from #temp t
     ) t
group by client_id, site_num, (seqnum_c - seqnum_cs)
1
votes

WHat you want is consecutive rows should not have the same SITE_NUM value. All you need to do is add a where clause at the end of your query.

SELECT *, ROW_NUMBER()OVER(PARTITION BY T.CLIENT_ID, T.SITE_NUM ORDER BY T.START_DATE, T.RECORD_ID)
FROM #TEMP T 
LEFT JOIN #TEMP T2 ON T2.CLIENT_ID = T.CLIENT_ID AND T2.RECORD_ID = T.RECORD_ID - 1
ORDER BY T.RECORD_ID
WHERE T.SITE_NUM <> T2.SITE_NUM OR T2.SITE_NUM IS NULL

EDIT As suggested by @SteveB to add T2.SITE_NUM IS NULL to show last record too.