Why are log binomial regression results different in R and SAS?

Question

I am new to R and have been working through datasets to learn it; most of my experience is with SAS. When I fitted a log-binomial regression with a dichotomous outcome and a dichotomous exposure, I noticed that R's result matched neither the crude relative risk from a contingency table nor the SAS output.

The dataset has 400 observations. The outcome is acceptance to college (1=Yes, 0=No) and the independent variable is high school class rank (1=high, 0=low).

I created a 2x2 table:

          Admission
Rank      1      0     Row total
   1     87    125        212
   0     40    148        188

Here one can see that high rank increases the probability of being admitted to college by a factor of 1.9 [(87/212)/(40/188)]. The crude estimate would produce a beta coefficient of approximately 0.65 (ln 1.9). Yet when I ran a log binomial regression in R, the beta coefficient it yielded was 0.289.
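As a quick check of that arithmetic, here is a minimal R sketch that uses only the four cell counts from the table above (the variable names are made up for illustration):

```r
# Cell counts from the 2x2 table (admitted / total, by rank)
admitted_high <- 87;  total_high <- 212
admitted_low  <- 40;  total_low  <- 188

risk_high <- admitted_high / total_high   # P(admit | high rank)
risk_low  <- admitted_low  / total_low    # P(admit | low rank)

rr <- risk_high / risk_low                # crude relative risk
round(rr, 3)        # 1.929
round(log(rr), 3)   # 0.657
```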

Here's my code:

glm(formula = admit ~ rank, family = binomial(link = log), data = mydata)

I know that in R I have to convert numerical variables into "factors" and order them. The reference group for both variables is 0.

In SAS the code I used was:

proc genmod data=temp;
  model admit=rank / link=log dist=binomial;
  estimate 'Prob of admission by rank' rank 1 / exp;
run;

The beta for rank is 0.657 (RR=1.93). Am I missing something? I know this seems like a basic question, but I cannot find my mistake.

thelatemail · Accepted Answer · 2014-08-27T05:33:14

Making the reference group 1 instead of 0 for both variables seems to fix it:

# x is assumed to be the data frame containing admit and rank
# change the reference level of both variables to "1":
x$rank  <- relevel(factor(x$rank), ref = "1")
x$admit <- relevel(factor(x$admit), ref = "1")

fit <- glm(admit ~ rank, data = x, family = binomial(link = "log"))
coef(fit)
# (Intercept)       rank0
#  -1.5475625   0.6568844
exp(coef(fit))
# (Intercept)       rank0
#    0.212766    1.928774

Whether this is a 'good thing' to do is somewhat questionable; read more here:

http://r.789695.n4.nabble.com/Relative-Risk-in-logistic-regression-td4657040.html
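
For a self-contained check, here is a minimal sketch that rebuilds the 400 observations from the 2x2 table in the question. The data frame `d`, its column names, and the numeric 0/1 coding are assumptions made for this sketch, not the asker's actual objects. It also illustrates one documented behaviour that may be relevant: when the response is a factor, glm() treats the first level as failure and all other levels as success, so the level order decides which probability is modelled.

```r
# Rebuild the data from the 2x2 table in the question.
# `d` and the numeric 0/1 coding are assumptions made for this sketch.
d <- data.frame(
  rank  = rep(c(1, 1, 0, 0), times = c(87, 125, 40, 148)),
  admit = rep(c(1, 0, 1, 0), times = c(87, 125, 40, 148))
)

# Numeric 0/1 coding: glm() models P(admit = 1), with rank = 0 as baseline.
fit_num <- glm(admit ~ rank, family = binomial(link = "log"), data = d)
coef(fit_num)[["rank"]]       # about 0.657: log RR of admission, high vs. low rank

# Factor coding with "1" as the first level of both variables: the first
# level of a factor response counts as failure, so this models P(admit = 0).
d$rank_f  <- factor(d$rank,  levels = c(1, 0))
d$admit_f <- factor(d$admit, levels = c(1, 0))
fit_fac <- glm(admit_f ~ rank_f, family = binomial(link = "log"), data = d)
coef(fit_fac)[["rank_f0"]]    # about 0.289: log RR of non-admission, low vs. high rank
```

On this reconstruction, the second coefficient equals log((148/188)/(125/212)), which happens to coincide with the 0.289 reported in the question. Your own data may be coded differently, so it is worth checking levels(x$admit) and levels(x$rank) before comparing with SAS.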
