2
votes
data have;
input ID Herpes;
datalines;
111 1
111 .
111 1
111 1
111 1
111 .
111 .
254 0
254 0
254 1
254 .
254 1
331 1
331 1
331 1
331 0
331 1
331 1
;

Where 1=Positive, 0=Negative, .=Missing/Not Indicated

Observations are sorted by ID (random numbers, no meaning) and date of visit (not included because not needed from here forward). Once you have Herpes, you always have Herpes. How do I adjust the Herpes variable (or create a new one) so that once a Positive is indicated (Herpes=1), all following obs will show Herpes=1 for that ID?

I want the resulting set to look like this:

111 1
111 1  (missing changed to 1)
111 1
111 1
111 1
111 1  (missing changed to 1)
111 1  (missing changed to 1)
254 0
254 0
254 1
254 1  (missing changed to 1 following positive at prior visit)
254 1
331 1 
331 1
331 1
331 1  (patient-indicated negative/0 changed to 1 because of prior + visit)
331 1
331 1

Why keep a later positive observation if it has no meaning? - Turophile
It has a meaning for the observation. Later in analyses, I want it to indicate that that individual (id) has already been identified as Herpes+ for that visit. - user3566853

3 Answers

2
votes

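The code below should do the trick. The key is to combine BY-group processing with the RETAIN statement: a retained flag is reset at the start of each ID, switched on at the first positive, and then used to set Herpes to 1 on that observation and every later one.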

proc sort data=have;
  by id;
run;

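data want;
  set have;
  by id;

  /* uh_oh carries the "already positive" flag across observations */
  retain uh_oh .;

  /* reset the flag at the first observation of each ID */
  if first.id then do;
    uh_oh = .;
  end;

  /* a positive result sets the flag (missing and 0 are both false) */
  if herpes then do;
    uh_oh = 1;
  end;

  /* once the flag is set, force Herpes to 1 */
  if uh_oh then do;
    herpes = 1;
  end;

  drop uh_oh;
run;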
0
votes

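You could create a new variable that holds a running sum of the herpes flag within ID:

proc sort data=have;
  by id;
run;

data have_too;
  set have;
  by id;

  /* restart the running sum for each ID */
  if first.id then sum_herpes_in_id = 0;
  /* sum statement: retains the variable and ignores missing herpes values */
  sum_herpes_in_id + herpes;
run;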

That way the sum is positive from the first observation with herpes=1 onward within each id. You can select these observations in other DATA steps or procs with: where sum_herpes_in_id;

And for free, the last observation of each id also holds the total number of herpes flags for that id (if that's of any use).
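
To get from the running sum to the layout the question asks for, one possible follow-up step is sketched below. It assumes the have_too dataset and the sum_herpes_in_id variable from the step above; the output name want_too and the variable herpes_ever are made up for illustration.

```sas
data want_too;
  set have_too;

  /* hypothetical 0/1 indicator: positive at this visit or an earlier one */
  herpes_ever = (sum_herpes_in_id > 0);

  /* overwrite Herpes only once a positive has been seen within the ID;
     earlier 0 and missing values are left as they were */
  if sum_herpes_in_id > 0 then herpes = 1;
run;
```

If you would rather keep the original Herpes values untouched, drop the IF statement and use herpes_ever on its own.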

0
votes

This can also be done in SQL. Here is an example using UPDATE to update the table in place; dtvar stands in for the visit-date variable, which the sample data omits. (This could also be done in base SAS with MODIFY.)

proc sql undopolicy=none;
update have H
    set herpes=1 where exists (
        select 1 from have V
        where h.id=v.id 
            and h.dtvar ge v.dtvar
            and v.herpes=1
    );
quit;

Here is the base SAS version using MODIFY. BY doesn't work in a one-dataset MODIFY for some reason, so you have to do your own version of first.id.

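data have;
  modify have;
  drop _:;
  retain _t _i;
  /* _i holds the previous ID; a change of ID resets the flag _t */
  if _i ne id then _t=.;
  _i=id;
  /* _t becomes 1 at the first positive and stays 1 for the rest of the ID */
  _t = _t or herpes;
  if _t then herpes=1;
run;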