I am new to R and have been playing around with datasets to learn it; most of my experience is with SAS. While fitting a log-binomial regression with a dichotomous outcome and a dichotomous exposure variable, I noticed that R's result matched neither the crude relative risk from a contingency table nor the SAS output.
The dataset has 400 observations. The outcome is acceptance to college (1=Yes, 0=No) and the independent variable is high school class rank (1=high, 0=low).
I created a 2x2 table:

            Admission
Rank        1      0      Row Total
1           87     125    212
0           40     148    188
Here one can see that high rank increases the probability of being admitted to college by a factor of about 1.93 [(87/212)/(40/188)]. This crude estimate corresponds to a beta coefficient of approximately 0.66 (ln 1.93). Yet when I ran a log-binomial regression in R, the beta coefficient it yielded was 0.289.
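To make the arithmetic explicit, here is a minimal R sketch of the crude calculation, using only the four cell counts from the table. The object names (`tab`, `risk`, `rr`, `log_rr`) are placeholders chosen for illustration.

```r
# 2x2 counts copied from the table (rows: rank 1, rank 0; columns: admit 1, admit 0)
tab <- matrix(c(87, 125,
                40, 148),
              nrow = 2, byrow = TRUE,
              dimnames = list(rank = c("1", "0"), admit = c("1", "0")))

risk   <- tab[, "1"] / rowSums(tab)       # P(admit = 1) within each rank group
rr     <- unname(risk["1"] / risk["0"])   # crude relative risk, high vs. low rank
log_rr <- log(rr)                         # value a log-binomial slope should match

print(round(c(RR = rr, log_RR = log_rr), 3))
```

This gives a relative risk of about 1.93 and a log relative risk of about 0.657, which is the number SAS reports.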
Here's my code:
glm(formula = admit ~ rank, family = binomial(link = log), data = mydata)
I know that in R I have to convert numerical variables into "factors" and order them. The reference group for both variables is 0.
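For comparison, here is a minimal sketch of the same model fitted to data rebuilt from the 2x2 counts, with the factor coding made explicit. The data frame `dat` is a stand-in constructed from the table, not my actual file, and it assumes both variables really are coded 0/1.

```r
# Rebuild one row per student from the 2x2 counts (400 rows in total).
# `dat` is a stand-in for the real data set, not the original file.
counts <- c(87, 125, 40, 148)
dat <- data.frame(
  rank  = rep(c(1, 1, 0, 0), times = counts),
  admit = rep(c(1, 0, 1, 0), times = counts)
)

# Make the coding explicit: 0 is the reference level of rank.
dat$rank <- factor(dat$rank, levels = c(0, 1))

# Check that the data reproduce the contingency table before modelling.
print(table(rank = dat$rank, admit = dat$admit))

# Log-binomial regression: binomial family with a log link.
fit <- glm(admit ~ rank, family = binomial(link = "log"), data = dat)

# With a single binary predictor the model is saturated, so the slope
# should match the log of the crude relative risk.
print(coef(fit))
print(exp(coef(fit)["rank1"]))
```

Running the same `table()` check on the real data would show whether `rank` and `admit` are coded the way the 2x2 table assumes.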
In SAS the code I used was:
proc genmod data=temp;
  model admit=rank / link=log dist=binomial;
  estimate 'Prob of admission by rank' rank 1 / exp;
run;
The beta for rank is 0.657 (RR=1.93). Am I missing something? I know this seems like a basic question, but I cannot find my mistake.