How to get duplicate values from SAS?

Question

For a data that has the following variable: Student_ID, Start_week, Sy, Item, Type, School, Tuition and Country. If an observation has a same combination of the (Student_Id, Start_week, Sy, Item, Type and School), it is a duplicate observation.

For instance:

Student_ID Start_week Sy Item Type School
10001           1     11  101   0    2
10001           1     11  101   0    2

The two observation is a duplicate because it has same value for the combination. What I was doing was by:

proc freq data = mydataset;
 by Student_ID;
 tables Start_week Sy Item Type School;
run;

However, this didn't really help me to see what are duplicates and what are not. I wanted to create a count variable so to count duplicates, but it didn't capture the combination but just the Student_ID. Moreover, by using proc freq, it was running out of memory.

What are some effective ways to identify the duplicate values?

Reeza Reeza · Accepted Answer · 2018-09-23T01:28:33

PROC SORT has a relatively new option called NOUNIQUEKEY that will remove all unique records based on the key variables.

data h;
 input name $ age h;
 datalines;
 kir 1 1
 kir 1 2
 nir 1 3
    ;

 proc sort data=h nouniquekey out=duplicates;
 by  name age;
 run;

How to get duplicate values from SAS?

2 Answers