Manual Maximum-Likelihood Estimation of an AR-Model in R

Question

I am trying to estimate a simple AR(1) model in R of the form y[t] = alpha + beta * y[t-1] + u[t] with u[t] being normally distributed with mean zero and standard deviation sigma.

I have simulated an AR(1) model with alpha = 10 and beta = 0.1:

library(stats)
data<-arima.sim(n=1000,list(ar=0.1),mean=10)

First check: OLS yields the following results:

lm(data~c(NA,data[1:length(data)-1]))

Call:
lm(formula = data ~ c(NA, data[1:length(data) - 1]))

Coefficients:
                (Intercept)  c(NA, data[1:length(data) - 1])  
                   10.02253                          0.09669

But my goal is to estimate the coefficients with ML. My negative log-likelihood function is:

logl<-function(sigma,alpha,beta){
-sum(log((1/(sqrt(2*pi)*sigma)) * exp(-((data-alpha-beta*c(NA,data[1:length(data)-1]))^2)/(2*sigma^2))))
}

that is, the sum of all log-single observation normal distributions, that are transformed by u[t] = y[t] - alpha - beta*y[t-1]. The lag has been created (just like in the OLS estimation above) by c(NA,data[1:length(data)-1]).

When I try to put it at work I get the following error:

library(stats4)
mle(logl,start=list(sigma=1,alpha=5,beta=0.05),method="L-BFGS-B")
Error in optim(start, f, method = method, hessian = TRUE, ...) : 
L-BFGS-B needs finite values of 'fn'

My log-likelihood function must be correct, when I try to estimate a linear model of the form y[t] = alpha + beta * x[t] + u[t] it works perfectly.

I just do not see how my initial values lead to a non-finite result? Trying any other initial values does not solve the problem.

Any help is highly appreciated!

I find it annoying when seeing code that obviously uses functions from non-standard packages is posted without including the library calls that would identify and load the function. (I also don't see recognition in your code of the fact that lagged variables have NA's.) — IRTFM
@DWin , I think it's just stats4::mle, so library(stats4) should do it. At least this is a base package ... — Ben Bolker
When I did ?mle I got a "No documentation for ‘mle’" message. So maybe it's like grid and not routinely loaded despite being recommended. — IRTFM
Yes, you do need library(stats4). I haven't looked carefully, but I strongly suspect that the primary issue is the NA values in the lagged variables -- this would make the ML NA. Rule #1: always check by evaluating your log-likelihood function at any proposed initial values to make sure the results make sense! — Ben Bolker
Sorry that I did not include all the "library" commands and the data. Now everything is there :) I also do believe that the error must come from the fact that there is a NA value in the data, even though to me it is strange to receive "this" message. Does anyone please know how I overcome the problem with the first value being NA? How do I tell R to start with the second observation? Thank you. — Chris437

Ben Bolker Ben Bolker · Accepted Answer · 2013-09-08T21:29:24

This works for me -- basically what you've done but leaving out the first element of the response, since we can't predict it with an AR model anyway.

Simulate:

library(stats)
set.seed(101)
data <- arima.sim(n=1000,list(ar=0.1),mean=10)

Negative log-likelihood:

logl <- function(sigma,alpha,beta) {
   -sum(dnorm(data[-1],alpha+beta*data[1:length(data)-1],sigma,log=TRUE))
}

Fit:

library(stats4)
mle(logl,start=list(sigma=1,alpha=5,beta=0.05),method="L-BFGS-B")
## Call:
## mle(minuslogl = logl, start = list(sigma = 1, alpha = 5, beta = 0.05), 
##     method = "L-BFGS-B")
## 
## Coefficients:
##  0.96150573 10.02658632  0.09437847

Alternatively:

df <- data.frame(y=data[-1],ylag1=head(data,-1))
library(bbmle)
mle2(y~dnorm(alpha+beta*ylag1,sigma),
     start=list(sigma=1,alpha=5,beta=0.05),
     data=df,method="L-BFGS-B")

Manual Maximum-Likelihood Estimation of an AR-Model in R

1 Answers