0
votes

i am not getting the same results by following Hilbe's, J. 2011 procedure (herein referred to as "the book") on page 20. The procedure is for computing Poisson regression with robust standard errors using the titanic data set in glm R. Hilbe's source code is in Table 2.4 according to the following link: Negative Binomial Regression Second edition Errata 2012

I believe Some changes to the titanic dataset has occurred since it was published here is the procedure and Stata results from the book and what was performed, which resulted in incorrect or different results in R as of july 2021:

library(COUNT)
data("titanic")
attach(titanic)
library(gmodels)

str(titanic)
'data.frame':   1316 obs. of  4 variables:
 $ class   : Factor w/ 3 levels "3rd class","1st class",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ age     : Factor w/ 2 levels "child","adults": 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "label")= chr "0=child; 1=adult"
  ..- attr(*, "format")= chr "%10.0g"
  ..- attr(*, "value.label.table")= Named int  0 1
  .. ..- attr(*, "names")= chr  "child" "adults"
 $ sex     : Factor w/ 2 levels "women","man": 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "label")= chr "gender: 0=female; 1=male"
  ..- attr(*, "format")= chr "%8.0g"
  ..- attr(*, "value.label.table")= Named int  0 1
  .. ..- attr(*, "names")= chr  "women" "man"
 $ survived: num  2 2 2 2 2 2 2 2 2 2 ...
 - attr(*, "stata.info")=List of 5
  ..$ datalabel  : chr "Hilbe, Modeling Count Data (CUP, 2014)"
  ..$ version    : int 12
  ..$ time.stamp : chr "14 Jul 2014 15:12"
  ..$ val.labels : chr  "class" "age" "sex" "survived"
  ..$ label.table:List of 4
  .. ..$ class   : Named int  1 2 3 4
  .. .. ..- attr(*, "names")= chr  "1st class" "2nd class" "3rd class" "crew"
  .. ..$ age     : Named int  0 1
  .. .. ..- attr(*, "names")= chr  "child" "adults"
  .. ..$ sex     : Named int  0 1
  .. .. ..- attr(*, "names")= chr  "women" "man"
  .. ..$ survived: Named int  0 1
  .. .. ..- attr(*, "names")= chr  "no" "yes"

the book relevels the class.

titanic$class <- relevel(factor(titanic$class), ref=3)

however, as of 2021 "survive" has become a factor opposed to what i believe used to be a binary 0="no" and 1="yes" integers, therefore, the survive is recoded accordingly

titanic$survived <- as.character(titanic$survived)
titanic$survived [which(titanic$survived =="no")] <- "0"
titanic$survived [which(titanic$survived =="yes")] <- "1"
titanic$survived <- as.integer(titanic$survived)

Code from the 2012 errata:

tit3 <- glm(survived ~ factor(class), family=poisson, data=titanic)
irr <- exp(coef(tit3)) # vector of IRRs
library("sandwich")
rse <- sqrt(diag(vcovHC(tit3, type="HC0"))) # coef robust SEs
irr*rse # IRR robust SEs

irr*rse output in R console

 (Intercept) factor(class)1st class factor(class)2nd class 
            0.01634255             0.19270871             0.15723303

using summary function

> summary(tit3)

Call:
glm(formula = survived ~ factor(class), family = poisson, data = titanic)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.1177  -0.7101  -0.7101   0.4364   1.1225  

Coefficients:
                       Estimate Std. Error z value Pr(>|z|)    
(Intercept)            -1.37783    0.07495 -18.384  < 2e-16 ***
factor(class)1st class  0.90721    0.10268   8.835  < 2e-16 ***
factor(class)2nd class  0.49603    0.11871   4.179 2.93e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 967.81  on 1315  degrees of freedom
Residual deviance: 889.69  on 1313  degrees of freedom
AIC: 1893.7

Number of Fisher Scoring iterations: 5

The following is believed to be the correct estimates because its what is n the book. The Incident Rate Ratio (IRR)is :

class2: 1.642184 
class1: 2.477407

and estimates and Robust Std. Err.

class2: 0.1572928
class1: 0.192782 

all have P>|z| == 0. Can someone confirm this? Thank you

1

1 Answers

0
votes

Confirmed!

data('titanic', package="COUNT")
titanic <- transform(titanic, survived=as.numeric(survived) - 1,
                     class=relevel(class, ref=3))

tit3 <- glm(survived ~ class, family=poisson, data=titanic)

library(sandwich);library(lmtest)
(smy <- coeftest(tit3, vcovHC(tit3, type="HC0")))
# z test of coefficients:
#   
#                 Estimate Std. Error  z value  Pr(>|z|)    
# (Intercept)    -1.377832   0.064819 -21.2565 < 2.2e-16 ***
# class1st class  0.907212   0.077786  11.6629 < 2.2e-16 ***
# class2nd class  0.496027   0.095746   5.1806 2.211e-07 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(irr <- exp(coef(tit3)))
# (Intercept) class1st class class2nd class 
#   0.2521246      2.4774071      1.6421841 

rse <- sqrt(diag(vcovHC(tit3, type="HC0")))
irr*rse
# (Intercept) class1st class class2nd class 
#  0.01634255     0.19270871     0.15723303