I have the following dataset in Stata
Id | date1 | date2
------------------------
1 | 28mar2005| 30jun2005
2 | 28mar2005| 31dec2005
3 | 28mar2005| 28febr2005
3 | 28mar2005| 30apr2005
3 | 28mar2005| 31dec2005
The values of the second column are the same for all observations. The year considered is the same in date1 and date2.
The day in date2 is always the last day of the relative month.
The dates in date2 are not repeated and are sorted within Id.
Within Id, I want to keep the observations of the dataset such that the "time distance" between "mm" of date1 and "mm" of date2 is the smallest. In case of indifference between n observations I want to keep the last observations displayed among the n observations. Hence, I want to get
Id | date1 | date2
------------------------
1 | 28mar2005| 30jun2005
2 | 28mar2005| 31dec2005
3 | 28mar2005| 30apr2005
My idea would be
(a) generate an "absolute value distance" between date1 and date2
(b) find the minimum absolute value distance by Id
(c) If a certain Id is listed more than once, keep the observation with the minimum absolute value distance
Any suggestion?