I am trying to compare the coefficients of two linear regressions that use the same variables but are fitted to different subgroups. I want to test, for each coefficient separately, whether its value in model 1 equals its value in model 2.

My reproducible data:

set.seed(123)  # any fixed seed; without one the random sample changes on every run
Data <- data.frame(
  gender = sample(c("men", "women"), 2000, replace = TRUE),
  var1 = sample(c("value1", "value2"), 2000, replace = TRUE),
  var2 = sample(c("valueA", "valueB"), 2000, replace = TRUE),
  y = sample(0:10, 2000, replace = TRUE)
)

I run the two regressions:

men <- subset(Data, gender == "men")
women <- subset(Data, gender == "women")

lm.men <- lm(y ~ var1 + var2, data = men)
summary(lm.men)
lm.women <- lm(y ~ var1 + var2, data = women)
summary(lm.women)

Basically, I want to test if:

  • coefficient var1 in lm.men = coefficient var1 in lm.women
  • coefficient var2 in lm.men = coefficient var2 in lm.women

I can't use the anova() function, because the two models are fitted to different samples. I think I should apply an F-test, but I can't find a function for it.

Does anyone know how to solve my problem?

This is more of a "Cross-Validated" question, and one that already has an answer - Barker
oops. at least my answer gives R-coding details that aren't included in the CV answers. - Ben Bolker

1 Answer

As @Barker points out in comments, the statistical part of this question is already answered on CrossValidated; I'll add some R-coding details here.

To answer these questions ("do the effects of var1 and var2 differ significantly between men and women?"), fit a single model with variable-by-gender interactions and test the interaction terms.

Data <- data.frame(
  gender = sample(c("men", "women"), 2000, replace = TRUE),
  var1 = sample(c("value1", "value2"), 2000, replace = TRUE),
  var2 = sample(c("valueA", "valueB"), 2000, replace = TRUE),
  y = sample(0:10, 2000, replace = TRUE)
)
mm <- lm(y ~ (var1 + var2) * gender, data = Data)
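
As a sanity check on why the interaction model answers the question, the sketch below (my own addition, with an arbitrary seed and helper names men_from_mm and women_from_mm that are not part of the original code) rebuilds the data and confirms that the pooled model reproduces the subgroup slopes. Because "men" is the reference level, the main effects are the men's slopes, and each interaction term is the women-minus-men difference in slope. Only the point estimates match; the standard errors need not, since the pooled model assumes one residual variance for both groups.

```r
set.seed(42)  # arbitrary seed, used only so this sketch is repeatable
Data <- data.frame(
  gender = sample(c("men", "women"), 2000, replace = TRUE),
  var1 = sample(c("value1", "value2"), 2000, replace = TRUE),
  var2 = sample(c("valueA", "valueB"), 2000, replace = TRUE),
  y = sample(0:10, 2000, replace = TRUE)
)

# Pooled model with variable-by-gender interactions
mm <- lm(y ~ (var1 + var2) * gender, data = Data)

# Separate subgroup models, as in the question
lm.men <- lm(y ~ var1 + var2, data = subset(Data, gender == "men"))
lm.women <- lm(y ~ var1 + var2, data = subset(Data, gender == "women"))

slopes <- c("var1value2", "var2valueB")
interax <- c("var1value2:genderwomen", "var2valueB:genderwomen")

# Men are the reference level: main effects are the men's slopes
men_from_mm <- unname(coef(mm)[slopes])
# Women's slopes are main effect plus interaction
women_from_mm <- unname(coef(mm)[slopes] + coef(mm)[interax])

# Both comparisons should report TRUE (equal up to numerical tolerance)
print(all.equal(men_from_mm, unname(coef(lm.men)[slopes])))
print(all.equal(women_from_mm, unname(coef(lm.women)[slopes])))
```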

Here are the interaction terms:

interax <- c("var1value2:genderwomen","var2valueB:genderwomen")
printCoefmat(coef(summary(mm))[interax,])
##                        Estimate Std. Error t value Pr(>|t|)
## var1value2:genderwomen  0.20144    0.28241  0.7133   0.4758
## var2valueB:genderwomen -0.15423    0.28266 -0.5456   0.5854
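
The question also mentions an F-test. One possible way to get a single joint test that both slopes are equal across genders (my own sketch, not something the rows above already show) is to compare the interaction model against a model with no interactions using anova(). Here both models are fitted to the same pooled data, so anova() is applicable. The names full, reduced, and joint and the seed are assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed, used only so this sketch is repeatable
Data <- data.frame(
  gender = sample(c("men", "women"), 2000, replace = TRUE),
  var1 = sample(c("value1", "value2"), 2000, replace = TRUE),
  var2 = sample(c("valueA", "valueB"), 2000, replace = TRUE),
  y = sample(0:10, 2000, replace = TRUE)
)

# Full model: slopes of var1 and var2 may differ by gender
full <- lm(y ~ (var1 + var2) * gender, data = Data)
# Reduced model: common slopes, but the intercept may still differ by gender
reduced <- lm(y ~ var1 + var2 + gender, data = Data)

# F-test of the two interaction terms jointly (2 numerator degrees of freedom)
joint <- anova(reduced, full)
print(joint)
```

A small p-value in the second row would suggest that at least one of the two slopes differs between men and women; the per-coefficient t-tests above then indicate which one.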