
I am currently working on a dataset in SAS like this:

people  word  date        rank
A       bla   01/01/2017  1
A       bla   02/01/2017  2
A       test  03/01/2017  3
B       bla   01/01/2017  1
B       test  09/01/2017  2
C       bla   03/01/2017  1
C       test  05/01/2017  2
C       test  07/01/2017  3
C       sas   08/01/2017  4

And I would like to transform it like this:

people  word  rank
A       bla   1
A       test  2
B       bla   1
B       test  2
C       bla   1
C       test  2
C       sas   3

The new rank follows the date order within each person, and each word is counted only once per person, at its first occurrence.

I tried the lag function and also CASE WHEN expressions. The latter works, but I have to write a branch for every case and my maximum rank is 94, so it is not really practical.

So I have not found a good way to produce the second table.

Can you help me?

Thanks a lot

Please post your sample data as text within your question, not as images. - user667489
I have changed my comment :) Is it better ? - chloe4
Please describe the logic of your transformation, and add the code you have tried. - Quentin
What is the meaning of maximum rank of 94? Does that mean you want a maximum of 94 observations (i.e. distinct values of WORD) for each value of PEOPLE? - Tom

1 Answer

Whilst posting your attempted code is good protocol on this site, I don't think it would help here as lag and case when are not the way to go.

Essentially you are trying to remove duplicate entries of word and rebase your rank column. You can achieve this in a single data step, taking advantage of first. processing, which is available when a by statement is used.

For the rank, the easiest way is to completely rebuild it from scratch as the data step moves through the records.

data have;
input people $ word $ date :ddmmyy10. rank;
format date ddmmyy10.;
datalines;
A  bla  01/01/2017  1
A  bla  02/01/2017  2
A  test  03/01/2017  3
B  bla  01/01/2017  1
B  test  09/01/2017  2
C  bla  03/01/2017  1
C  test  05/01/2017  2
C  test  07/01/2017  3
C  sas  08/01/2017  4
;
run;

data want;
set have (drop=rank date); /* remove rank as being rebuilt; date not required */
by people word notsorted; /* enable first. processing; notsorted option required as data not sorted by people and word */
if first.people then rank=0; /* reset rank when people value changes */
if first.word then do;
    rank+1; /* increment rank by 1 for the first row of each word (subsequent duplicates are ignored) */
    output; /* output row */
end;
run;
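
One caveat: with the notsorted option, first.word is set at the start of each run of consecutive identical words, so a word that reappears later for the same person (for example bla, test, bla) would be counted again. If that can happen in the real data, a possible variation is sketched below. It assumes the have dataset above; the names sorted, firsts and want2 are only illustrative. It keeps the earliest date for each people/word pair, then rebuilds the rank in date order.

```sas
/* Sort so that the earliest date of each people/word pair comes first */
proc sort data=have out=sorted;
    by people word date;
run;

/* Keep only the first (earliest) row for each people/word pair */
data firsts;
    set sorted;
    by people word;
    if first.word;
run;

/* Put the remaining rows back into date order within each person */
proc sort data=firsts;
    by people date;
run;

/* Rebuild rank from scratch, restarting at 1 for each person */
data want2;
    set firsts (drop=rank);
    by people;
    if first.people then rank=0;
    rank+1;
    drop date;
run;
```

This costs two extra sorts, so the single data step above is the simpler choice when the duplicate words are always adjacent, as they are in the sample data.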