1
votes

I currently have a health injury data set of scores 0-6, where 0 is no injury and 6 is fatal injury. This is across 6 categorical body region variables. I'm attempting to construct an Abbreviated Injury Scale, where the three highest scores in an observation would be considered for the calculations. How do I filter the three highest in each row in SAS? Below is an example:

ID A B C D E F

1 0 0 0 3 4 0

2 1 2 1 4 0 0

3 0 0 5 0 0 0

4 1 2 1 5 4 0

So in OBS 1, scores 3, 4, and 0 would be used; OBS 2 - 4, 2, and 1; OBS 3 - 5, 0, and 0; OBS 4 - 5, 4, 2.

1
Or Largest Function.Haikuo Bian
Or transpose to have 1 record per body region (6 records per subject ID).Quentin

1 Answers

0
votes

I've provided code below to do what you asked, and detailed out the steps enough that you should be able to modify it for many options/uses.

Basically, it takes your data, transposes it as Quentin suggested and then uses proc means to output the top 3 observations for each ID.

DATA NEW;
INPUT ID A B C D E F;
CARDS;
1 0 0 0 3 4 0
2 1 2 1 4 0 0
3 0 0 5 0 0 0
4 1 2 1 5 4 0
RUN;

PROC TRANSPOSE DATA=NEW OUT=T_OUT(RENAME=(_NAME_ = VARIABLE COL1=VALUES));
BY ID;
VAR A B C D E F;
PROC PRINT DATA=T_OUT;
RUN;

PROC MEANS DATA=T_OUT NOPRINT;
CLASS ID;
TYPES ID;
VAR VALUES;
OUTPUT OUT=TOP3LIST(RENAME=(_FREQ_=RANK VALUES_MEAN=INDEX_CRITERIA))SUM= MEAN=
    IDGROUP(MAX(VALUES) OUT[3] (VALUES VARIABLE)=)/AUTOLABEL AUTONAME;
PROC PRINT DATA=TOP3LIST;
RUN;

***THEN YOU CAN MERGE THIS DATA SET TO YOUR ORIGINAL ONE BY ID TO GET YOUR INDEX CRITERIA ADDED TO IT***;
***THE INDEX_CRITERIA IS A MEAN FROM PROC MEANS BEFORE THE KEEPING OF JUST THE TOP3 VALUES***;
DATA FINAL (DROP=_TYPE_ RANK VALUES_Sum VALUES_1 VALUES_2 VALUES_3 VARIABLE_1 VARIABLE_2 VARIABLE_3);
MERGE NEW TOP3LIST;
INDEX_CRITERIA2=SUM(VALUES_1, VALUES_2, VALUES_3)/3; *THIS CRITERIA IS AVERAGE OF THE KEPT 3 VALUES;
BY ID;
PROC PRINT DATA=FINAL;
RUN;

Best regards, john