I have a panel data that consists of 5908 individual observations in 5 years. I want to estimate the maximum likelihood estimator using optim() package. Here's my code
library(pglm)
library(plm)
data("HealthIns")
dat<- pdata.frame(HealthIns,index = c("id","year"))
summary(dat)
dat$claims<-ifelse(dat$size>=4,1,0)
y<-data.matrix(dat$claims)
y[is.na(y)]=0
Y<-matrix(data=y,nrow=5908,ncol=5)
dat$ageclass<-ifelse(dat$age >=30,1,0)
x1<-data.matrix(dat$ageclass)
x1[is.na(x1)]=0
X1<-matrix(data=x1,nrow=5908,ncol=5)
dat$gender <-ifelse(dat$sex=="male",1,0)
x2<-data.matrix(dat$gender)
x2[is.na(x2)]=0
X2<-matrix(data=x2,nrow=5908,ncol=5)
dat$child<-ifelse(dat$child=="yes",1,0)
x3<-data.matrix(dat$child)
x3[is.na(x3)]=0
X3<-matrix(data=x3,nrow=5908,ncol=5)
The -log likelihood that I want to use is
po.gam=function(para){
#Lambda(i,t)
{for (i in (1:5908)){
for(t in (1:5)){
lambda<-as.matrix(exp(para[1] + para[2]*X1 + para[3]*X2 + para[4]*X3),nrow=5908,ncol=5)}}
}
num.claims.of.t <-numeric(nrow(Y))
{for (i in seq(nrow(Y))){
num.claims.of.t[i] <-sum(Y[i,])}
}
num.lambda.of.t<-numeric(nrow(Y))
{for (i in seq(nrow(Y))){
num.lambda.of.t[i]<-sum(lambda[i,])}
}
prod.exp<-numeric(nrow(Y))
{for (i in seq(nrow(Y))){
prod.exp[i]<-prod(lambda[i,]^Y[i,]/factorial(Y[i,]))}
}
joint.pdf.mvnb<-(prod.exp)*(gamma(num.claims.of.t + (1/para[5]))/gamma(1/para[5]))*(((1/para[5])/(num.lambda.of.t + (1/para[5])))^(1/para[5]))*((num.lambda.of.t + (1/para[5]))^(-num.claims.of.t))
#PRODUC NUMBER OF CLAIMS SEMUA INDIVIDU
prod.mvnb=1
for (i in (length(joint.pdf.mvnb))){
prod.mvnb<-prod(joint.pdf.mvnb[i])
}
return(-log(prod.mvnb))
}
Then using optim()
start.value <- c(beta0=0.01,beta1=0.01,beta2=0.01,beta3=0.01,alfa=0.01)
MLE_pogam<-optim(start.value,fn=po.gam,hessian=FALSE)
MLE_pogam
And the program will run more than 2 hours still without any output. Do you have any suggestion for optimize the log likelihood function for the big data? Thank you!!
The default method is an implementation of that of Nelder and Mead (1965), that uses only function values and is robust but relatively slow.. - user2974951iandt, so the result will be exactly the same each time, so the for loops could be removed - Miff