We are fitting a regression model that contains two categorical variables: age group and gender.

We want to include an interaction term between the two categorical variables, but the resulting model only displays the interaction effects for females across the age groups.

How can we adjust the code so that it keeps "males" aged "26-30" as a reference level and shows the effect of all other groups in its output?

Current code:

count_med_op3 <- glm(Count_OP ~ Gender * age_group + otherfactors,
                     data = med, family = 'poisson')

Result wanted for:

GenderMale:age_group"0-1" 
GenderMale:age_group"2-6"
GenderMale:age_group"7-18"
GenderMale:age_group"19-25"
GenderMale:age_group"31-36"
Genderfemale:age_group"0-1"
Genderfemale:age_group"2-6"
Genderfemale:age_group"7-18"
Genderfemale:age_group"19-25"
Genderfemale:age_group"26-30"
other factors
1 Answer

Use relevel() to change which level of a factor is treated as the reference (omitted) level:

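library(dplyr)  # provides tibble() and the %>% pipe

# simulate some data
set.seed(1)  # make the simulation reproducible
df_foo = tibble(  # tibble() replaces the deprecated data_frame()
  age = as.factor(sample(seq(10, 90, 10), 100, replace = TRUE)),
  y = rnorm(100),
  gender = as.factor(sample(c("Male", "Female"), 100, replace = TRUE))
)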

# female as omitted level
df_foo %>% 
  lm(y ~ age*gender, data = .) %>% 
  summary()

# male as omitted level
df_foo %>% 
  lm(y ~ age*relevel(gender, ref = "Male"), data = .) %>% 
  summary()
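
Applied to the model in the question, this might look like the sketch below. The `med` data frame here is simulated as a stand-in (the real data is not shown in the question), the `otherfactors` placeholder is left out, and the names `fit_cells` and `cell` are made up for illustration. Note that with "Male" as the reference for Gender, the interaction rows are still labelled Genderfemale:..., because the reference level never gets its own row. If one coefficient per gender/age combination relative to males aged 26-30 is what is wanted, one option is to combine the two factors into a single factor and relevel that, as in the second model.

```r
# Hypothetical stand-in for the asker's `med` data; the real columns may differ.
set.seed(42)
n <- 500
ages <- c("0-1", "2-6", "7-18", "19-25", "26-30", "31-36")
med <- data.frame(
  Gender    = factor(sample(c("Male", "female"), n, replace = TRUE)),
  age_group = factor(sample(ages, n, replace = TRUE), levels = ages),
  Count_OP  = rpois(n, lambda = 3)
)

# Option 1: set the reference levels on the data, then fit the interaction model.
# The intercept now corresponds to males aged 26-30.
med$Gender    <- relevel(med$Gender, ref = "Male")
med$age_group <- relevel(med$age_group, ref = "26-30")

count_med_op3 <- glm(Count_OP ~ Gender * age_group,
                     data = med, family = "poisson")
summary(count_med_op3)

# Option 2: one combined factor with a level per gender/age cell.
# interaction() builds level names such as "Male.26-30".
med$cell <- relevel(interaction(med$Gender, med$age_group), ref = "Male.26-30")

fit_cells <- glm(Count_OP ~ cell, data = med, family = "poisson")
summary(fit_cells)
```

Both models give the same fitted values; they only differ in how the coefficients are parameterised. In the second one, each coefficient compares a gender/age cell directly with males aged 26-30.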