Theory
Inspired from Lindén and Mäntyniemi's paper to use two overdispersion parameters to model the linear and! the quadratic mean-variance relationship. See the following equation. The case theta = 0 corresponds to a quasi-poisson, the case omega = 1 to the regular negative binomial.
sigma^2 = omega * mu + theta * mu^2
I mostly followed Ver Hoef and Boveng to implement it through iterative weighted least squares (IWLS).
Implementation IWLS
I can test the following code for the quasi-poisson and a negative binomial with omega fixed to one, but can not verify that it is well implemented for the negative binomial with omega unequal 1.
IWLS <- function(Omega1,Theta1,tollvl=1,X=X){
X <- as.matrix(X) # model.matrix(), factor variables only
mu <- rep(mean(Y),length(X[,1]))
Omega <- Omega1
Theta <- Theta1
Y.head<-Y # 152 datapoints
tol=10
i=1
while(tol>tollvl){ # Iterative
W<-diag(c(mu/(Omega[i] + Theta[i]*mu)) )
# Theta = 0 -> Quasipoisson, Omega = 1 -> negbin, both unfixed (reason wrote this code)
Beta.head<-solve(t(X)%*%W%*%X,t(X), tol=9^-45)%*%W%*%Y.head
nu <- X%*%Beta.head
mu <- exp(nu)
Y.head <- nu + (Y-mu)/mu # derivative of log(nu) -> 1/nu
if(Theta1==0){
Theta <- c(Theta,0)
Omega <- c(Omega,mean( (Y-mu)^2/mu) )
} else{
if(Omega1==1){
Theta<- c(Theta,mean( (( (Y-mu)^2) - mu )/mean(mu)^2))
Omega <- c(Omega,1)
} else{ # NB2
ThetaF <- mean( (Y-mu)^2 - mu )/(mean(mu)^2)
Omega <- c(Omega,mean( ((Y-mu)^2)/mu - mu*ThetaF) )
Theta <- c(Theta,mean( ((Y-mu)^2 - mu*Omega[i+1] )/(mean(mu)^2 )) )
}
}
tol <- abs(Omega[i+1]-Omega[i])+abs(Theta[i+1]-Theta[i])
i <- i+1
}
list(Omega=Omega,
Theta=Theta,
Beta.head=Beta.head,
SE = sqrt(diag(solve(t(X)%*%W%*%X))),
FittedValues = mu)
}
Questions / Problems
The main concern is about the following line, specifiing the double overdispersed negative binomial. I found no literature to back up that code. I tried a for() loop here and different approaches regarding the mean, since taking the mean for every mu or taking the mean at the end highly influences the result.
else{ # NB2
ThetaF <- mean( (Y-mu)^2 - mu )/(mean(mu)^2)
Omega <- c(Omega,mean( ((Y-mu)^2)/mu - mu*ThetaF) )
Theta <- c(Theta,mean( ((Y-mu)^2 - mu*Omega[i+1] )/(mean(mu)^2 )) )
}
This version of the code works well (Beta.head same, only different SE compared to other two models) for a relatively small amount of covariates, where the model matrix has dimension of around 152:5. For more columns / multiple factor variables the algorithm fails to compute a standard error for every attribute with the {In sqrt(diag(solve(t(X) %% W %% X))) : NaNs produced}, indicating multilinearity. Theta and omega sometimes don't converge, but even if they do this message appears for more than one variable.
Testcode to run:
set.seed(123)
Y.Test<-sample(1:100,152,replace = T)*rpois(152,1)+1 # Try to model true data
XXX<-model.matrix(Y.Test~as.factor(sample(1:4,152,replace = T))+as.factor(sample(1:3,152,replace = T)))
Neg.Bin2.Test<-IWLS(Omega1 =var(Y)/mean(Y),Theta1 = var(Y)/mean(Y)^2 - 1/mean(Y), tollvl = 10^-10,X=XXX)
If multilinearity is the concern, why is it then only problematic for the double overdispersed model. A omega=1 fix solves the problem.