1
votes

I have a dataset with one row per week for 2 years (so 104 rows). I have a flag column which is either 1 or 0 for each week. I want to create a new column with the following logic:

if the flag=1 for that week then have a 1 for that week and the following 3 weeks as flag_new.

My current approach, which works, is:

if flag=1 or lag(flag)=1 or lag2(flag)=1 or lag3(flag)=1 then flag_new=1;

Although this works, it becomes very tedious if I want flag_new to be 1 for the following 20 or 30 weeks instead of just 3 weeks.

I was hoping there would be an easier way to do this (perhaps a loop?), but I am not too familiar with it.

Any help is much appreciated.

2

2 Answers

2
votes

Maybe instead of a look back, think of it as a look ahead. That is, each time you see flag=1, set flag_new=1 for that record and the next three records. Something like (untested):

if flag=1 then count=3;
else count+(-1) ; *implicit retain from sum statement;

if count>=0 then flag_new=1;
0
votes

You can use a temporary array as well to keep the lagged information and then capture the highest of the array. If it's a one then you can set the new flag to 1 as well. To change the dimensions, just change the 2 to the n-1 you need.

This also demonstrates the BY statements and resetting it for the beginning of a new group.

data want;

 array p{0:2} _temporary_;

 set have;
 by object;

if first.object then call missing(of p{*});
p{mod(_n_,4)} = flag;
highest = max(of p{*});

if highest > 1 then do;
    flag_new = 1;
end;
run;