I have a dataset with n levels of id and n+4 variables (s, id, x, z, and one y variable per level of id). I want to run a separate regression for each of the n levels of id, using z and the n-1 y variables whose suffix differs from the current id as explanatory variables. Here is my dataset:
data have;
input s id $ x z y00 y01 y02;
cards;
1 00 95 5.00 .02 .43 .33
2 00 100 5.50 .01 .44 .75
3 00 110 5.25 .10 .37 .34
4 00 97 5.00 .02 .43 .33
5 00 100 5.50 .01 .43 .75
6 00 120 5.25 .10 .38 .47
7 00 95 5.00 .02 .43 .35
8 00 130 5.50 .01 .44 .75
9 00 110 5.25 .10 .39 .44
10 00 85 5.00 .02 .43 .33
11 00 110 5.50 .01 .47 .78
12 00 110 5.25 .10 .37 .44
1 01 20 6.00 .22 .01 .66
2 01 25 5.95 .43 .10 .20
3 01 70 4.50 .88 .05 .17
1 02 80 2.50 .65 .33 .03
2 02 85 3.25 .55 .47 .04
3 02 90 2.75 .77 .55 .01
;
run;
So I wish to use z, y01, and y02 to explain x for ID 00. Similarly, z, y00, and y02 will explain x for ID 01. Lastly, z, y00, and y01 will explain x for ID 02.
I could use a BY statement, but I can't think of how to tell the model to ignore the y variable whose suffix matches the ID I am currently working with.
I could also create a separate dataset for each ID, but n > 100 for some of these analyses, so that is not practical.
Ideally, I would run a PROC MIXED and a PROC REG for every ID as described above, and end up with a dataset holding the parameter estimates for each ID.
Any ideas? For reference, this is what I currently run for a single ID (00):
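/* Mixed model for ID 00 only: y00 is left out of the model */
proc mixed data=have(where=(id='00')) plots(only)=all method=REML nobound;
    class s;
    model x = z y01 y02 / solution;
    random z y01 y02;
run;

/* Ordinary least squares for the same ID and the same regressors */
proc reg data=have(where=(id='00'));
    model x = z y01 y02;
run;
quit;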
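One possible direction is a macro loop that rebuilds the regressor list for each ID instead of relying on BY. The following is an untested sketch rather than a definitive solution. The macro name reg_by_id, the output dataset reg_params, and the temporary _est datasets are made-up names. It assumes the data set lives in WORK, that every variable whose name starts with y is a candidate regressor, and that each ID value has a matching variable named y<id>.

```sas
/* Hypothetical macro: one PROC REG per id, leaving out the y variable
   whose suffix equals the current id. All names here are illustrative. */
%macro reg_by_id(data=have, out=reg_params);
    %local i n id xvars;

    /* Store the distinct id values in macro variables id1, id2, ... */
    proc sql noprint;
        select distinct id into :id1- from &data;
    quit;
    %let n = &sqlobs;

    %do i = 1 %to &n;
        %let id = &&id&i;
        %let xvars = ;

        /* Every y variable except the one matching the current id.
           Assumes the data set is in WORK. */
        proc sql noprint;
            select name into :xvars separated by ' '
            from dictionary.columns
            where libname = 'WORK' and memname = "%upcase(&data)"
              and upcase(name) like 'Y%' and upcase(name) ne "%upcase(y&id)";
        quit;

        /* Fit the model for this id and keep the estimates */
        proc reg data=&data(where=(id="&id")) outest=_est&i noprint;
            model x = z &xvars;
        run;
        quit;

        /* Tag the estimates with the id they belong to */
        data _est&i;
            length id $8;
            set _est&i;
            id = "&id";
        run;
    %end;

    /* Stack the per-id estimates; the excluded y variable is missing in each row */
    data &out;
        set %do i = 1 %to &n; _est&i %end; ;
    run;
%mend reg_by_id;

%reg_by_id(data=have, out=reg_params)
```

The per-ID estimates are stacked with a DATA step rather than PROC APPEND because each ID has a different set of y columns. The same loop could presumably wrap the PROC MIXED call, capturing its fixed-effects table with an ODS OUTPUT statement instead of OUTEST=.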
Thanks.