One way is to use SQL to compute count (distinct FIRMName)
within a group and then Proc REPORT
or Proc TABULATE
to present the distinct counts as columns.
Your data does not appear to model a scenario in which a firm can have multiple codes within a year; if so, the need for distinct
is not specifically needed.
Example:
Compute the number of firms with a code within a year, maintaining a categorical data structure. Present the counts in a wide layout presentation. Also, show a data transformation from two level category into a wide data set (generally not recommended)
data have;
call streaminit (20210425);
do firmname = 'a', 'b', 'c', 'd';
do year = 2000 to 2010;
code = 10 + rand('integer', 0, 5);
output;
end;
end;
run;
proc sql;
create table counts as
select year, code, count(distinct firmname) as firm_ucount
from have
group by year, code
order by year, code;
proc tabulate data=counts;
class year code;
var firm_ucount;
table year, code * firm_ucount='' * mean='' * f=4.;
run;
proc report data=counts;
columns year firm_ucount,code;
define year / group;
define code / '' across;
define firm_ucount / '#firms with code';
run;
proc transpose data=counts out=want prefix=n_code_;
by year;
id code;
var firm_ucount;
run;
TABULATE
, count class in column-dimension
REPORT
, count variable ACROSS
TRANSPOSE
, code as part of column name