I am looking for a way to produce a frequency table displaying number of unique values for variable ID
per uniqe value of variable Subclass
.
I would like to order the results by variable Class
.
Preferably I would like to display the number of unique values for ID
per Subclass
as a fraction of n for ID
. In the want-example below this values is displayed under %totalID.
In addition I would like to display the number of unique values for ID
per Subclass
as a fraction of the sum of unique ID
values found within each Class
. In the want-example below this values is displayed under %withinclassID.
Have:
ID Class Subclass
-------------------------------
ID1 1 1a
ID1 1 1b
ID1 1 1c
ID1 2 2a
ID2 1 1a
ID2 1 1b
ID2 2 2a
ID2 2 2b
ID2 3 3a
ID3 1 1a
ID3 1 1d
ID3 2 2a
ID3 3 3a
ID3 3 3b
Want:
Unique number
Class Subclass of IDs %totalID %withinclassID
--------------------------------------------------------------------
1
1a 3 100.0 50.00
1b 2 66.67 33.33
1c 1 33.33 16.67
SUM 6
2
2a 3 100.0 75.00
2b 1 33.33 25.00
SUM 4
3
3a 2 66.67 66.67
3b 1 33.33 33.33
SUM 3
My initial approach was to perform a PROC FREQ on NLEVELS producing a frequency table for number of unique IDs per subclass. Here however I lose information on class. I therefore cannot order the results by class.
My second approach involved using PROC TABULATE. I however cannot produce any percentage calculations based on unique counts in such a table.
Is there a direct way to tabulate the frequencies of one variable according to a second variable, grouped by a third variable--displaying overall and within group percentages?