
I would like to create, from start and end dates, dummy variables that take the value 1 if the day falls within the range. For example, from

id    start     end
1     01072014  05072014
1     05012014  06012015

I would like to get

id    start     end       d_01012014  d_02012014 d_03012014 ... d_01052014 ... d_31122014
1     01012014  02012014  1           1          0              0              0
1     01052014  02052015  0           0          0              1              0

so that I can eventually reshape my data to long format, dropping all observations outside the day range. My idea was to use a loop over Stata date values, something like this:

foreach i in *stataformat startdate*/*stataformat enddate* {
generate d_`i'=1 if `i'>=start & `i'<=end
}

But the problem with this method is that my variables would all have incomprehensible names. So can you either suggest another approach, or do you have an idea how to rename variables containing Stata date codes to 'understandable' names? Thanks a lot!

Could you give a complete macro? I don't know what you put in for the stataformat startdate. A potential workaround is to change your dates into "number of days/weeks elapsed from a reference date." And another one (I haven't verified) is to extract the month, day, and year from each i, and concatenate them together as your variable name. - Penguin_Knight
The internal Stata format for dates (not date and time) is the number of days since 1-1-1960, so that is already in the form suggested by @Penguin_Knight. - Maarten Buis
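
To make the two comments above concrete, here is a minimal sketch of readable variable names, assuming start and end are already numeric Stata daily dates (days since 01jan1960). The macro names first, last, d and lbl are hypothetical and chosen only for illustration.

```stata
* Assumption: start and end are numeric Stata daily dates, not strings
summarize start, meanonly
local first = r(min)
summarize end, meanonly
local last = r(max)

forvalues d = `first'/`last' {
    * format the numeric date as DDMMYYYY, e.g. 01072014, for use in the name
    local lbl : display %tdDDNNCCYY `d'
    * 1 if day `d' lies within [start, end], 0 otherwise
    generate byte d_`lbl' = inrange(`d', start, end)
}
```

Note that this creates one variable per calendar day in the overall range, so a long date span produces a very wide dataset.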

1 Answer


If I wanted to do this from first principles, I would start with long-format data:

clear
input id  spell  str10 start   str10 end
      1   1      "01-07-2014"  "05-07-2014"
      1   2      "06-08-2014"  "06-01-2015"
end

* the dates in the question are day-month-year (e.g. 31122014), so parse with "DMY"
gen start2 = date(start, "DMY")
gen end2 = date(end, "DMY")

format start2 %td
format end2 %td

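* earliest start date and the total number of days covered by all spells
sum start2
local min = r(min)
sum end2
local range = r(max) - `min' + 1

* one row per spell per calendar day, then keep only the days inside each spell
expand `range'
bys id spell : gen date = `min' + _n - 1
format date %td
keep if date >= start2 & date <= end2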

However, since this is probably survival analysis data, and you have already stset the dataset (or are going to), you can just use stsplit.