1
votes

I've got a wide dataset with each month listed as a column. I'd like to transpose the data to a long format, but the problem is that my column names will be changing in the future. What's the best way to transpose with a dynamic variable being passed to the transpose statement?

For example:

data have;
input  subject $ "Jan-10"n $ "Feb-10"n $ "Mar-10"n $;
datalines;
1   12   18  22 
2   13   19  23
;
run;

data want;
input  subject month $ value;
datalines;
1   Jan-10   12
1   Feb-10   18
1   Mar-10   22
2   Jan-10   13
2   Feb-10   19
2   Mar-10   23
;
run;
1
In HAVE the month variables are character but in WANT they are changed to numeric when transpose into VALUE. What is the actual data type in your real have? FYI by default PROC TRANSPOSE transposes all numeric variables.data _null_

1 Answers

2
votes

Simply run the transpose procedure and provide only the by statement.

I've updated your sample data to convert the months to numeric values (rather than character which can't be transposed). I've also changed them to use valid base-sas names by removing the hyphen.

data have;
input  subject $ "Jan10"n  "Feb10"n  "Mar10"n ;
datalines;
1   12   18  22 
2   13   19  23
;
run;

Here's the transpose syntax you need, it will transpose all numeric variables by default:

proc transpose data=have out=want;
  by subject;
run;

You could also do something more explicit, but still dynamic such as:

proc transpose data=have out=want;
  by subject;
  var jan: feb: mar: ; * ETC;
run;

This would transpose all vars that begin with jan/feb/mar etc... Useful in case your table contains other numeric variables that you don't want to include in the transpose.