I'm running SAS code on a data set with thousands of rows (typical). I need to create two new variables in a data step that hold row-wise sums: for each observation of y, one sum over the variables whose names start with X and one over the variables whose names start with Z. Obviously I cannot write out every variable I need to sum, because that would be impossible in my actual data set. I think the answer is a loop of some sort, but I'm not having any luck finding a solution online that doesn't require listing all of the variables.

A much smaller example data set is listed below of what I need the data to look like at the end.

So far I have tried something like this, but I know it is far off. I'm really stuck on how to get it to recognize the variable names and stop when it hits the last X or the last Z.

 DATA sample1 (drop = i);
 set data; 
 do i = i to 10;
 answer = sum(i);
 end;
run


1 Answer

You can use a variable list shortcut with the colon (:). In sum(of X:), the list "of X:" means every variable whose name starts with X, so the function sums all of them.


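data want;
    set have;

    * Name prefix list: every variable whose name starts with X (or Z);
    sumX = sum(of X:);
    sumZ = sum(of Z:);

    * Alternative if you know the end of the series: numbered range list;
    sumX = sum(of X1-X4);
    sumZ = sum(of Z1-Z5);

run;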
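
To make the prefix list concrete, here is a small self-contained sketch. The data set have, its variable names (y, X1-X3, Z1-Z2) and its values are made up for illustration and are not the asker's real data.

```sas
/* Hypothetical sample data: names and values are invented for illustration */
data have;
    input y $ X1 X2 X3 Z1 Z2;
    datalines;
a 1 2 3 10 20
b 4 5 6 30 40
;
run;

data want;
    set have;
    sumX = sum(of X:);   /* X1 + X2 + X3 for each row */
    sumZ = sum(of Z:);   /* Z1 + Z2 for each row */
run;

/* Inspect the result */
proc print data=want noobs;
run;
```

With this sample, the row for y = a should get sumX = 6 and sumZ = 30, and the row for y = b should get sumX = 15 and sumZ = 70. Adding more X or Z columns to have requires no change to the second data step.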

The different ways of specifying a variable list are illustrated here.
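
If you do want an explicit loop, as the question suggests, the same prefix list can be used to define an array. This is only a sketch under the same made-up sample data as an assumption (have, y, X1-X3, Z1-Z2); the array name xs and the output data set name want_loop are hypothetical.

```sas
/* Hypothetical sample data: names and values are invented for illustration */
data have;
    input y $ X1 X2 X3 Z1 Z2;
    datalines;
a 1 2 3 10 20
b 4 5 6 30 40
;
run;

data want_loop (drop = i);
    set have;
    array xs {*} X:;                  /* every variable whose name starts with X */
    sumX = 0;
    do i = 1 to dim(xs);              /* dim() returns the number of array elements */
        sumX = sum(sumX, xs{i});      /* sum() skips missing values */
    end;
run;
```

For a plain total, sum(of X:) is shorter. The array form is mainly useful when each element needs extra handling inside the loop. One difference to keep in mind: because sumX starts at 0 here, a row where every X is missing gives 0, whereas sum(of X:) would return a missing value.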