Different ways of specifying the same AR (or MA) model to be estimated by function arima() in package forecast in R yield different BIC (Bayesian information criterion) values.
Why does this happen?
Consider two models:
(1) AR(1)
(2) AR(2) with coefficient on AR2 restricted to zero
On paper, the two model are the same. However, their estimation might differ (?). Not sure why they produce equal coefficient estimates, equal log-likelihood values and equal AIC values -- but different BIC values.
Since BIC values differ while likelihoods are equal and AIC values are equal, the number of observations used in estimation must be different between the two models. However, the implied difference in the number of observations is not 1 or 2 but much more.
Is this justified, or is it a bug ???
I wonder what the difference is and how BIC is calculated in case (2). I would like to be able to reproduce the results, so I need to understand how things work here.
Below I provide a reproducible example. Having executed it in R, look at the printed BIC, and also AICc, values -- they are different between the models.
library(forecast)
T=1000; seed=1; set.seed(seed); x=rnorm(T)
model1=arima(x,order=c(1,0,0) ,method="CSS-ML",transform.pars=FALSE)
model2=arima(x,order=c(2,0,0),fixed=c(NA,0,NA),method="CSS-ML",transform.pars=FALSE)
print(model1)
print(model2)
The same applies to AR(p) and MA(q) models, which I do not discuss explicitly to keep it simple.
Would be great if someone could explain why this happens. Thanks!