1
votes

I have a SAS dataset similar to:

Joe
Joe
John
Bill
Bill
Bill
Bill
Joanne
Joanne
Fred
Fred
Fred
Ted

What I am trying to do is make it into a new dataset that has each entry only once, with an added variable that stores the "streak" of that variable. For example, the set would look like:

Joe     2
Bill    4
Joanne  2
Fred    3
Ted     1

So far I've created the first data set but am stuck on how to do the second. My idea is to create a variable in the data step that holds the value of the last observation for comparison, but I can't figure out how to do that, or then how I'd turn it into the second data set.

I'm not sure if returning to the previous observation is even possible in SAS, maybe this is a problem for a proc sql, but then I need a new data set not a table.

1

1 Answers

4
votes

If the name will only occur once, i.e. you can't have a streak thats bill, bill, joe, joe, bill, bill then use proc freq:

proc freq data=have;
table name/out=want;
run;

Otherwise use the notsorted option with a counter.

data want;
set have;

*BY tells sas my data has groups according to the "Name" variable , 
with the notsorted meaning that it isn't an alphabetical or numerical order
basically, whenever the Name changes its a new group;
by name notsorted;

*Keep the value across the different rows rather than reset;
retain counter;

*if this is the first of a group then set the counter to 1;
if first.name then counter=1;
*If this isn't the first name in the group then increment the counter,
retain means its kept the value from previous row;
else counter+1;

*If this is the last of the "Name" group then output the observation;
if last.name then output;
run;