I have a list of people with medical visits on specific dates, and I've already flagged them for specific diagnoses. Now I'm trying to collapse these flags to create a file with one person/one date per line. I've found a solution that partially works, using UPDATE (found here: SAS collapse dates). However this overwrites the flagged diagnosis code with the last diagnosis code for a person on a specific date. Here is a simplified version of my data:
data have;
input id id_date diag_code $ flag;
datalines;
1 1 a .
1 1 b 1
1 1 c .
1 2 d 1
1 2 e .
1 2 f 1
2 1 g .
2 1 h .
2 1 i 1
2 2 j 1
3 1 k .
;
run;
data want;
update have (obs=0) have;
by id id_date;
run;
Output:
diag_
Obs id id_date code flag
1 1 1 c 1
2 1 2 f 1
3 2 1 i 1
4 2 2 j 1
5 3 1 k .
What I'm trying to get is:
diag_
Obs id id_date code flag
1 1 1 b 1
2 1 2 d 1
3 2 1 i 1
4 2 2 j 1
5 3 1 k .
So basically, I'd like to keep the diag_code from the first observation that has flag=1. I've tried the RENAME option to prevent overwriting the variable, but since UPDATE first reads in a dataset with 0 observations, the (new) original variable shows up but is empty:
data want;
update have (obs=0 rename=(diag_code=orig_diag_code)) have;
by id id_date;
run;
orig_
diag_ diag_
Obs id id_date code flag code
1 1 1 1 c
2 1 2 1 f
3 2 1 1 i
4 2 2 1 j
5 3 1 . k
Any ideas?
Edit: Adding where flag = 1 still erroneously shows the last diagnosis for when multiple flags occur, and does not produce observations on dates when the flag is missing:
data want;
update have (obs=0) have;
by id id_date;
where flag=1;
run;
diag_
Obs id id_date code flag
1 1 1 b 1
2 1 2 f 1
3 2 1 i 1
4 2 2 j 1
(where = (flag = 1))statement? - AdamO