1
votes

This is a newbie SAS question. I have a dataset with numerical variables v1-v120, V and a categorical variable Z(with say three possible values). For each possible value of Z, I would like to get another set of variables w1-w120, where w{i}=sum(v{i}}/V, where the sum is a sum over a given value of Z. Thus I am looking for 3*120 matrix in this case. I can do this in data step, but would like to do it by Proc SQL or Proc MEANS, as the number of categorical variables in the actual dataset is moderately large. Thanks in advance.

2

2 Answers

3
votes

Here's a solution using proc sql. You could probably also do something similar with proc means using an output dataset and a 'by' statement.

data t1;
    input z v1 v2 v3;
    datalines;
        1 2 3 4
        2 3 4 5
        3 4 5 6
        1 7 8 9
        2 4 7 9
        3 2 2 2
    ;
run;

%macro listForSQL(varstem1, varstem2, numvars);
    %local numWithCommas;
    %let numWithCommas = %eval(&numvars - 1);
    %local i;
    %do i = 1 %to &numWithCommas;
        mean(&varstem1.&i) as &varstem2.&i,
    %end;
    mean(&varstem1.&numvars) as &varstem2.&numvars
%mend listForSQL;

proc sql;
    create table t2 as
        select
            z,
            %listForSQL(v, z, 3)
        from t1
        group by z
    ;
quit;
3
votes

It's easy to do this with proc means. Using the t1 data set from Louisa Grey's answer:

proc means data=t1 nway noprint;
  class z;
  var v1-v3;
  output out=t3 mean=w1-w3;
run;

This creates an table of results that match the SQL results.