
I am fitting a Poisson GLM to some dummy data to predict ClaimCount from two variables, Frequency and JudicialOrientation.

Dummy Data Frame:

data5 <-data.frame(Year=c("2006","2006","2006","2007","2007","2007","2008","2009","2010","2010","2009","2009"), 
           Loss = c(100000,100,2500,100000,25000,0,7500,5200, 900,100,0,50),

Model GLM:

ClaimModel <- glm(ClaimCount ~ JudicialOrientation + Frequency,
                  family = poisson(link = "log"), offset = log(Exposure),
                  data = data5, na.action = na.pass)

glm(formula = ClaimCount ~ JudicialOrientation + Frequency, family = poisson(link = "log"), 
    data = data5, na.action = na.pass, offset = log(Exposure))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.7555  -0.7277  -0.1196   2.6895   7.4768  

                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   -0.3493     0.2125  -1.644      0.1    
JudicialOrientationNeutral    -3.3343     0.5664  -5.887 3.94e-09 ***
JudicialOrientationPlaintiff  -3.4512     0.6337  -5.446 5.15e-08 ***
Frequency                     39.8765     6.7255   5.929 3.04e-09 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 149.72  on 11  degrees of freedom
Residual deviance: 111.59  on  8  degrees of freedom
AIC: 159.43

Number of Fisher Scoring iterations: 6

I am also using log(Exposure) as an offset.

I then want to use this GLM to predict claim counts for the same observations:

data5$ExpClaimCount <- predict(ClaimModel, newdata=data5, type="response")

If I understand correctly, the Poisson GLM equation should then be:

ClaimCount = exp(-.3493 + -3.3343*JudicialOrientationNeutral + -3.4512*JudicialOrientationPlaintiff + 39.8765*Frequency + log(Exposure))

However, when I tried this manually for some of the observations (for example, in Excel, =EXP(-0.3493+0+0+LOG(10)) for observation 1), I did not get the correct answer.

Is my understanding of the GLM equation incorrect?

You're probably seeing different results because LOG in Excel is the base 10 logarithm. Try using LN instead. - tkmckenzie
@tkmckenzie Exactly; in R the default is log(x, base = exp(1)). - floe

1 Answer


Your assumption about how predict() works for a Poisson GLM is correct. This can be verified in R:

co <- coef(ClaimModel)
p1 <- with(data5,
           exp(log(Exposure) +                            # offset
               co[1] +                                    # intercept
               ifelse(as.numeric(JudicialOrientation)>1,  # factor term
                      co[as.numeric(JudicialOrientation)], 0) +
               Frequency * co[4]))                        # linear term

all.equal(p1, predict(ClaimModel, type="response"), check.names=FALSE)
[1] TRUE
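
The same check can also be written with the model's design matrix, which avoids indexing the coefficients by hand. The sketch below is self-contained and uses a hypothetical toy data frame (toy, with made-up values), because data5 is only partly shown in the question; the names toy, fit, eta and manual are illustrative only.

```r
# Hypothetical toy data, NOT the asker's data5 (values are made up)
set.seed(1)
toy <- data.frame(
  JudicialOrientation = factor(rep(c("Defense", "Neutral", "Plaintiff"), 4)),
  Frequency           = runif(12, 0, 0.1),
  Exposure            = rep(c(10, 20, 5, 50), each = 3)
)
toy$ClaimCount <- rpois(12, lambda = toy$Exposure * 0.2)

# Same model structure as in the question
fit <- glm(ClaimCount ~ JudicialOrientation + Frequency,
           family = poisson(link = "log"),
           offset = log(Exposure), data = toy)

X      <- model.matrix(fit)                    # intercept, dummy columns, Frequency
eta    <- drop(X %*% coef(fit)) + fit$offset   # linear predictor including the offset
manual <- exp(eta)                             # inverse of the log link

all.equal(manual, predict(fit, type = "response"), check.attributes = FALSE)
```

With data5 in place of toy, the same three lines (X, eta, manual) should reproduce predict(ClaimModel, type="response"), provided the model was fitted as shown in the question.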

As indicated in the comments, you probably get the wrong results in Excel because of the different base of the logarithm (10 for Excel's LOG, Euler's number for R's log).
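
To see how much the base matters, here is a small sketch that mirrors the Excel formula from the question. It assumes, as that formula does, an observation at the reference level of JudicialOrientation with an Exposure of 10 and no Frequency contribution; the variable names are illustrative only.

```r
intercept <- -0.3493   # intercept from the summary output in the question
exposure  <- 10        # assumed exposure, taken from the asker's Excel formula

natural_log_result <- exp(intercept + log(exposure))    # natural log, as R uses: about 7.05
base10_log_result  <- exp(intercept + log10(exposure))  # base 10, as Excel's LOG: about 1.92

natural_log_result
base10_log_result
```

In Excel, =EXP(-0.3493+LN(10)) corresponds to the first value.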