I have been working on this and I am not getting my dataset as expected.
I have a SAS dataset 1 given below for sample observations. This table gives the start and end ranges of dates.
Dataset 1
id start end var1
A 03/15/1992 03/20/1992 1
A 03/24/1992 03/26/1992 2
A 03/28/1992 03/31/1992 5
B 06/06/1994 06/06/1994 1
I have another SAS dataset which gives the outer ends of ranges of dates that should contain start and end ranges from dataset 1.
Dataset 2
A 01/01/1992 03/16/1992 3
A 03/17/1992 03/19/1992 4
A 03/20/1992 05/25/1992 6
B 06/06/1994 06/06/1994 8
Final table should have the exact ranges of dataset 1 and dataset 2 ranges should be adjusted so that the end dates are continuous to succeeding and preceding rows. In other words, final table should have non-overlapping intervals.
Final expected dataset
A 01/01/1992 03/14/1992 . 3
A 03/15/1992 03/20/1992 1 . <---- range stays the same as in dataset1
A 03/21/1992 03/23/1992 . 6
A 03/24/1992 03/26/1992 2 . <---- range stays the same as in dataset1
A 03/27/1992 03/27/1992 . 6
A 03/28/1992 03/31/1992 5 . <---- range stays the same as in dataset1
A 04/01/1992 05/25/1992 . 6
B 06/06/1994 06/06/1994 1 8
I am not sure if this can be achieved in SAS.
Addition to my earlier post: I have another data that need the ranges to be adjusted exactly as above but I don't have the IDs in dataset1 but rather another identifier that is common between the two dataset. Example dataset given below:
Dataset 1
id2 start end var1
A1122 03/15/1992 03/20/1992 1
A1122 03/24/1992 03/26/1992 2
A1122 03/21/1992 03/22/1992 2
A2222 03/25/1994 03/25/1994 1
A1122 01/01/1992 01/01/1992 1
Dataset 2
id1 id2 start end var2
A A1122 01/01/1992 03/16/1992 3
A A1122 03/17/1992 03/19/1992 4
A A2222 03/20/1992 05/25/1992 6
B A1122 06/06/1994 06/06/1994 8
Final dataset must retain all ranges from dataset1 exactly the way it is and ranges in dataset 2 to be adjusted for continuity.
Final dataset
id1 id2 start end var1 var2
A A1122 01/01/1992 01/01/1992 1 . <-same as dataset 1
A A1122 01/02/1992 01/14/1992 . 3
A A1122 03/15/1992 03/16/1992 1 2 <-from dataset 1 and 2
A A1122 03/17/1992 03/19/1992 1 4 <-from dataset 1 and 2
A A2222 03/20/1992 03/20/1992 1 6 <-from dataset 1 and 2
A A1122 03/21/1992 03/22/1992 2 . <-same as dataset 1
A A1122 03/23/1992 03/24/1992 . 6
A A2222 03/25/1992 03/25/1992 1 . <-same as dataset 1
B A1122 06/06/1994 06/06/1994 . 8
The ranges in dataset1 could completely overlap the ranges in dataset2 and spill to next row or lie within the ranges of dataset2. The ranges in dataset2 have no gaps between rows.
*More explanation:*The final dataset should also retain the 'end' dates of dataset2 so that the corresponding var1 would be correctly associated with the range. To achieve this, ranges in dataset 1 may have to split eg. obs. # 3,4,5 in final dataset.