Suppose that I have a long dataset in Stata, categorized by vartype, where vartype ranges from A to D.
list var1 var2 var3 vartype in 1/10
     +--------------------------------------+
     |  var1   var2          var3   vartype |
     |--------------------------------------|
  1. | 1:Yes      1        900000         A |
  2. | 1:Yes      1             0         C |
  3. | 1:Yes      1             0         A |
  4. | 1:Yes      1       1000000         D |
  5. | 1:Yes      1       8000000         C |
     |--------------------------------------|
  6. | 1:Yes      1       3100000         C |
  7. | 1:Yes      1             0         B |
  8. | 1:Yes      1       4000000         A |
  9. | 1:Yes      1             .         A |
 10. | 1:Yes      1   1.00000e+12         B |
     +--------------------------------------+
I want to reshape it into wide and rename each original variable (var1 var2 var3) to a different name (say inpatient outpatient cost). I also want each code of vartype (A to D) to become a different category (chol diab hyper cancer) after doing reshape.
For example, after reshape wide, I will get var1A, var1B, var1C, etc. and want to rename them as inpatient_chol, inpatient_diab, inpatient_hyper, etc. This should also apply to the other variables: var2 = outpatient and var3 = cost.
For now, all I know how to do is the lines below. I am looking for another way, such as a nested loop or maybe even simpler code.
reshape wide var1 var2 var3, i(hhid pid) j(vartype) string
* Step 1: replace the vartype suffix (A to D) with the condition name
foreach y in var1 var2 var3 {
    rename `y'A `y'cholesterol
    rename `y'B `y'diabetes
    rename `y'C `y'hypertension
    rename `y'D `y'cancer
}
* Step 2: replace the var1-var3 stub with a descriptive prefix
foreach x in cholesterol diabetes hypertension cancer {
    rename var1`x' has_`x'
    rename var2`x' inpatient_`x'
    rename var3`x' cost_`x'
}
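The nested loop I have in mind might look something like the sketch below. It is only an illustration on made-up toy data: the hhid and pid values are invented, and the target names follow the inpatient/outpatient/cost and chol/diab/hyper/cancer mapping described earlier rather than the names used in the two loops above. The idea is to keep old and new names in parallel lists and pick the k-th word from each.

```stata
* Toy data standing in for the real file (hhid and pid values are invented)
clear
input hhid pid str1 vartype var1 var2 var3
1 1 "A" 1 1 900000
1 1 "B" 1 1 0
1 1 "C" 1 1 0
1 1 "D" 1 1 1000000
2 1 "A" 1 1 4000000
2 1 "C" 1 1 3100000
end

reshape wide var1 var2 var3, i(hhid pid) j(vartype) string

* Parallel lists: the k-th old name maps to the k-th new name
local oldstubs var1 var2 var3
local newstubs inpatient outpatient cost
local codes A B C D
local cats chol diab hyper cancer

forvalues s = 1/3 {
    local old : word `s' of `oldstubs'
    local new : word `s' of `newstubs'
    forvalues c = 1/4 {
        local code : word `c' of `codes'
        local cat : word `c' of `cats'
        * e.g. var1A becomes inpatient_chol
        rename `old'`code' `new'_`cat'
    }
}

describe
```

This does the same twelve renames in one pass, and adding a new variable or category only means extending the lists and the loop bounds.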
I know I can rename and recode each variable and each vartype before reshaping it into wide. I just want to know if there's another way for a wide dataset.
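For comparison, the rename-before-reshape route I already know looks roughly like this. It is a sketch on invented toy data (the hhid and pid values are made up, and vartype is declared as str6 here only so the longer labels fit without relying on type promotion). The stubs are renamed once while the data are still long, and the values of vartype are replaced so that reshape builds the final suffixes itself.

```stata
* Toy data standing in for the real file (hhid and pid values are invented)
clear
input hhid pid str6 vartype var1 var2 var3
1 1 "A" 1 1 900000
1 1 "B" 1 1 0
1 1 "C" 1 1 0
1 1 "D" 1 1 1000000
2 1 "A" 1 1 4000000
2 1 "C" 1 1 3100000
end

* Rename the three stubs while the data are still long
rename (var1 var2 var3) (inpatient_ outpatient_ cost_)

* Replace the j values with the category names
replace vartype = "chol"   if vartype == "A"
replace vartype = "diab"   if vartype == "B"
replace vartype = "hyper"  if vartype == "C"
replace vartype = "cancer" if vartype == "D"

* reshape now produces inpatient_chol, outpatient_diab, cost_cancer, ...
reshape wide inpatient_ outpatient_ cost_, i(hhid pid) j(vartype) string

describe
```

That takes seven rename/replace lines instead of twelve, but it only works while the data are still long, which is why I am asking about the wide case.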
rename allows many different syntaxes. Off-hand I don't see that any offers much improvement on what you already have. – Nick Cox
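To make the comment concrete, here is a sketch of what the wildcard form of rename might look like on the wide data. It uses invented toy data (hhid and pid values are made up) and the inpatient/outpatient/cost and chol/diab/hyper/cancer names from the question text. Note the assumption that no other variables in the real file start with var1, var2, var3 or end in _A to _D, since the patterns would catch those too.

```stata
* Toy data standing in for the real file (hhid and pid values are invented)
clear
input hhid pid str1 vartype var1 var2 var3
1 1 "A" 1 1 900000
1 1 "B" 1 1 0
1 1 "C" 1 1 0
1 1 "D" 1 1 1000000
2 1 "A" 1 1 4000000
2 1 "C" 1 1 3100000
end

reshape wide var1 var2 var3, i(hhid pid) j(vartype) string

* Stub renames: * stands for whatever follows the old stub
rename var1* inpatient_*
rename var2* outpatient_*
rename var3* cost_*

* Suffix renames: * stands for whatever precedes the old suffix
rename *_A *_chol
rename *_B *_diab
rename *_C *_hyper
rename *_D *_cancer

describe
```

This comes to seven rename lines, which is about the same length as the two loops, so it is consistent with the remark that the alternative syntaxes do not save much here.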