1
votes

I have many datasets for many years from 2001 to 2014 which look like the following. Each year is stored in one file, yXXXX.sas7bdat,

ID Weight X1 X2 X3
1  100    1  2  4
2  300    4  3  4

and I need to create a dataset where for each year we have the (weighted) sums of each of the X columns.

X1 X2 X3 Year
10 20 30 2014
40 15 20 2013

I would be happy to implement this into a macro but I am unsure of a way to isolate column sums, and also an efficient way to attach results together (proc append?)

Edit: Including an attempt.

%macro final_dataset;

%do i = 2001 %to 2014;

/*Code here which enables me to get the column sums I am interested in.*/

proc means data = y&i;
 weight = weight;
 X1 = SUM X1;
 X2 = SUM X2;
 X3 = SUM X3;
 OUTPUT OUT = sums&i;
run;

data final;
 set final sums&i;
run;

%end;
%mend;

Edit: Another attempt.

%macro final_dataset;

%do i = 2001 %to 2014;

/*Code here which enables me to get the column sums I am interested in.*/

proc means data = y&i SUM;
 weight = weight;
 var X1 X2 X3;
 OUTPUT OUT = sums&i;
run;

data final;
 set final sums&i;
run;

%end;
%mend;

Edit: Final.

%macro final_dataset;

%do i = 2001 %to 2014;

/*Code here which enables me to get the column sums I am interested in.*/

proc means data = y&i SUM NOPRINT;
 weight = weight;
 var X1 X2 X3;
 OUTPUT OUT = sums&i sum(X1 X2 X3) = X1 X2 X3;
run;

data final;
 set final sums&i;
run;

%end;
%mend;
1
post what you've tried so far. The easiest method is to probably append all the datasets together and then use proc means on that dataset. No macros's required.Reeza

1 Answers

1
votes

This is probably what I'd do, append all the data sets together and run one proc means. You didn't mention how big the data sets are, but I'm assuming smaller data.

data combined;
length source year $50.;
set y2001-y2014 indsname=source;
*you can tweak this variable so it looks how you want it to;
year=source;
run;


proc means data=combined noprint nway;
class year;
var x1 x2 x3;
output out=want sum= ;
run;