0
votes

For example, I want to create a new dataset (Data2) from Data1.

A new variable, cost in data2 is calculated as sum of multiple observation by ID in material of data1.

(Data1)

ID  material
1   4
1   4
1   4
2   2
2   4
2   4
3   2
3   6
3   6
4   5
4   5
4   5
4   5
5   2
5   4
5   4
5   8

(Data2)

ID  cost    
1   12  #4+4+4
2   10  #2+4+4
3   14  #2+6+6
4   20  #5+5+5+5
5   18  #2+4+4+8

I have used SAS EG version only for simple analysis, and recently I started to use proc sql procedure. As a beginner in SAS coding (proc sql), it was very hard to approach the answer, for myself. Thank you very much, in advance.

2
Hint: GROUP BY.Gordon Linoff

2 Answers

1
votes

Base SAS has several procedures that will present aggregated values over a group. MEANS, SUMMARY, and reporting procedures such as REPORT and TABULATE. The procedures can also save output data sets containing the computed aggregates.

data have; input
ID  material_cost;datalines;
1   4
1   4
1   4
2   2
2   4
2   4
3   2
3   6
3   6
4   5
4   5
4   5
4   5
5   2
5   4
5   4
5   8
run;

title "Proc MEANS";
proc means data=have sum noNobs maxdec=0;
  class id;
  var material_cost;
run;

title "Proc SUMMARY";
proc summary data=have print sum noNobs maxdec=0;
  class id;
  var material_cost;
run;

title "Proc REPORT";
proc report data=have;
  columns id material_cost;
  define id / group;
run;

title "Proc TABULATE";
proc tabulate data=have;
  class id;
  var material_cost;
  table id, material_cost*sum / NoCellMerge;
run;
0
votes

If you want to use PROC SQL, this is a straight forward use of GROUP BY

proc sql;
  select id, sum(material) as sum from mydataset group by id;
  quit;

You could manually compute this in a datastep also if you don't want to use PROC SQL

proc sort data=mydataset;
  by id;
run;

data sums;
  set mydataset;
  by id;

  if first.id then sum = 0;
  sum + material;
  if last.id then output;

  keep id sum;

run;

proc print data=sums;
run;