Let's suppose I have three independent categorical variables e, f and g and would like to estimate the dependent variable y. After some work, I came up with the following regression model:

y = b0 + b1*x + b2*y + b3*z + b4*(x*y) + b5*(x*z)

How can I determine whether there is an overall significant difference for the different categories/levels of x? Since the terms with b2 and b3 are equal, I think they can probably be neglected.

1 Answer

From what I understand, the three categorical variables are x, y, and z in the regression model. I am going to rename the predictor y to w, because the outcome and one of the predictors are both labeled y. For this post, I am referring to this model:

Y = b0 + b1x + b2w + b3z + b4(xw) + b5(xz)

Your model includes interactions between x and each of the other two variables (xw and xz); it has no wz term.

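If that is true, then you cannot read b1 on its own as the overall effect of x on Y. Why? With the interaction terms in the model, b1 is the effect of x only when w and z are at their reference levels, so it is a conditional effect rather than a main effect. The interaction columns are also correlated with the main-effect columns, which can inflate the standard errors of b1, b2, and b3. Another way to think of it is that the effect of x depends on w and z (hence the interactions).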
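
To make that dependence explicit, here is a short worked step. It assumes, purely for illustration, that x, w, and z are each coded as 0/1 dummies (the question does not say how many levels they have). Subtracting the model at x = 0 from the model at x = 1, with w and z held fixed:

$$
\begin{aligned}
Y(x=1) - Y(x=0) &= (b_0 + b_1 + b_2 w + b_3 z + b_4 w + b_5 z) - (b_0 + b_2 w + b_3 z) \\
&= b_1 + b_4 w + b_5 z
\end{aligned}
$$

The gap between x = 1 and x = 0 therefore shifts by b4 when w switches on and by b5 when z switches on.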

If you want to understand the direct effect of x on Y (or of w on Y, or of z on Y), then you can fit a model without the interaction terms, i.e., fit

Y = b0 + b1x + b2w + b3z

and look at the significance of b1. This model assumes that the effect of x on Y does not depend on w or z.
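
Since x is categorical, it may have more than two levels. In that case it enters the model as several dummy columns and there is no single coefficient to inspect. One common way to get an overall test is a nested-model F-test: fit the additive model with and without the x dummies and compare the residual sums of squares. Below is a minimal sketch with NumPy. The data are simulated and made up purely for illustration, x is assumed to have three levels, and `dummies`, `rss`, and `f_test_for_x` are hypothetical helper names, not library functions.

```python
import numpy as np

def dummies(codes, n_levels):
    """One 0/1 column per non-reference level (level 0 is the reference)."""
    return np.column_stack([(codes == k).astype(float) for k in range(1, n_levels)])

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_test_for_x(x, w, z, y, x_levels):
    """F statistic for dropping all x dummies from Y ~ x + w + z."""
    n = len(y)
    reduced = np.column_stack([np.ones((n, 1)), w, z])        # model without x
    full = np.column_stack([reduced, dummies(x, x_levels)])   # model with x dummies
    q = x_levels - 1                  # number of coefficients being tested
    df_resid = n - full.shape[1]      # residual degrees of freedom, full model
    rss_reduced, rss_full = rss(reduced, y), rss(full, y)
    f_stat = ((rss_reduced - rss_full) / q) / (rss_full / df_resid)
    return f_stat, q, df_resid

# Simulated data (assumption: x has 3 levels, w and z are binary)
rng = np.random.default_rng(0)
n = 120
x = rng.integers(0, 3, size=n)
w = rng.integers(0, 2, size=n)
z = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * (x == 1) + 4.0 * (x == 2) + 0.5 * w - 0.5 * z + rng.normal(size=n)

f_stat, q, df_resid = f_test_for_x(x, w, z, y, x_levels=3)
print(f"F = {f_stat:.2f} on ({q}, {df_resid}) degrees of freedom")
```

The statistic is compared against an F distribution with (q, df_resid) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, q, df_resid)` gives the corresponding p-value.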

Because you mention b2 and b3 are approximately the same, I suggest another approach. You can collapse the w and z variables together (if it makes sense scientifically) and fit a model with just an interaction term between x and the merged wz variable.

Suppose you need the interaction terms and want to communicate how x affects Y. Then you can choose meaningful values of the covariates and explain how they change the outcome. This strategy de-emphasizes 'significance' and shifts the attention to interpretation and meaning. For example, relative to the baseline (x=0, w=0, z=0), the change in predicted Y is b1 if x=1, w=0, and z=0; it is b1 + b3 + b5 if x=1, w=0, z=1, etc.
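
This kind of table is easy to produce by plugging values into the fitted equation. The sketch below assumes 0/1 coding for x, w, and z. The coefficient values are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are illustrative.

```python
from itertools import product

# Hypothetical coefficients, made up for illustration only
b = {"b0": 10.0, "b1": 2.0, "b2": 1.0, "b3": 1.0, "b4": -0.5, "b5": 1.5}

def predict(b, x, w, z):
    """Y = b0 + b1*x + b2*w + b3*z + b4*(x*w) + b5*(x*z)."""
    return (b["b0"] + b["b1"] * x + b["b2"] * w + b["b3"] * z
            + b["b4"] * x * w + b["b5"] * x * z)

def shift_from_baseline(b, x, w, z):
    """Change in predicted Y relative to x=0, w=0, z=0."""
    return predict(b, x, w, z) - predict(b, 0, 0, 0)

def effect_of_x(b, w, z):
    """Difference between x=1 and x=0 at fixed w and z."""
    return predict(b, 1, w, z) - predict(b, 0, w, z)

# One row per combination of the covariates w and z
for w, z in product([0, 1], repeat=2):
    print(f"w={w} z={z}  "
          f"shift at x=1: {shift_from_baseline(b, 1, w, z):.2f}  "
          f"effect of x: {effect_of_x(b, w, z):.2f}")
```

Reporting the effect of x separately for each combination of w and z shows readers directly how much the interaction matters, without relying on a single coefficient.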