I am using a Kalman filter to estimate various dynamic and arbitrage-free Nelson-Siegel models for yield curves. I pass some starting values to optim and the algorithm converges without problems. However, when I try to calculate standard errors from the Hessian returned by optim, I get NaNs because of nonpositive values on the diagonal of the variance-covariance matrix. I suspect the cause is that the objective function is highly nonlinear with many local optima, but the problem occurs for every set of starting values I try.
The function I use is optim, together with the default Nelder-Mead algorithm. The command I use is
opt_para<-optim(par=par0, fn=Kalman_filter, y=y,
maturities=maturities,control=list(maxit=20000),hessian=TRUE)
The starting values are given in par0, which is
> par0
[1] 9.736930e-01 1.046646e+00 5.936238e-01 4.444669e-02 2.889251e-07 6.646960e+00 7.715964e-01 9.945551e-01 9.663361e-01
[10] 6.000000e-01 6.000000e-01 6.000000e-01 6.000000e-02 5.000000e-01 5.000000e-01 5.000000e-01 5.000000e-01
The optim output that I get is
$par
[1] 0.833208307 1.373442068 0.749313983 0.646577154 0.237102069 6.882644818 0.788775982 0.918378263 0.991982038
[10] 0.748509055 0.005115171 0.392213941 0.717186499 0.121525623 0.386227284 0.001970431 0.845279611
$value
[1] 575.7886
$counts
function gradient
5225 NA
$convergence
[1] 0
$message
NULL
I then use the following command to produce the standard errors of the estimates.
hessian<-opt_para$hessian
fish_info<-solve(hessian,tol=1e-100)
st_errors<- sqrt(diag(fish_info))
st_errors
I get the following output
st_errors
[1] NaN NaN 2.9170315888 NaN NaN NaN 0.0294300357 0.0373614751 NaN
[10] 0.0785349634 0.0005656580 NaN 0.0470600219 0.0053255251 0.0408666177 0.0001561243 0.4540428740
The NaNs are produced by negative values on the diagonal, which should be impossible in a variance-covariance matrix. I suspect that something is wrong with the optimization procedure.
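To make the "should be impossible" point concrete, here is a minimal sketch of the check I have in mind. It uses a toy negative log-likelihood (a normal sample with unknown mean and log standard deviation), which is an assumption for illustration only and not my Kalman filter; the names `negloglik`, `hess_eigen`, `is_pd` and `vcov_hat` are made up for this example. At a proper local minimum of a negative log-likelihood the Hessian should be positive definite, so all of its eigenvalues should be positive and the diagonal of its inverse should be positive too.

```r
# Toy data and objective (illustrative assumption, not the Kalman filter)
set.seed(1)
x <- rnorm(200, mean = 2, sd = 1.5)
negloglik <- function(par, x) {
  # par[1] = mean, par[2] = log of the standard deviation
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Same style of call as above: default Nelder-Mead, numerical Hessian
fit <- optim(par = c(0, 0), fn = negloglik, x = x, hessian = TRUE)

# A positive definite Hessian has strictly positive eigenvalues
hess_eigen <- eigen(fit$hessian, symmetric = TRUE, only.values = TRUE)$values
is_pd <- all(hess_eigen > 0)

# Standard errors: square roots of the diagonal of the inverse Hessian,
# reported only when the Hessian passes the check
vcov_hat <- solve(fit$hessian)
st_errors <- if (is_pd) sqrt(diag(vcov_hat)) else rep(NA_real_, length(fit$par))
```

Running the same eigenvalue check on `opt_para$hessian` would presumably show at least one nonpositive eigenvalue in my case, which is consistent with the NaNs above.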
To be clear, I also include the function I want to optimize. It is a Kalman filter with updating equations and some restrictions built in.
Kalman_filter<-function(par, y, maturities){
  # Unpack the parameter vector
  b0<-c(par[1],par[2],par[3])
  P0<-diag(c(par[4],par[5],par[6]))
  Phi<-diag(c(par[7],par[8],par[9]))
  mu<-c(par[10],par[11],par[12])
  lambda<-par[13]
  sigma11<-par[14]
  sigma21<-par[15]
  sigma22<-par[16]
  sigma33<-par[17]
  m=length(b0)
  n=length(y[,1])
  d<-length(y[1,])
  # Measurement and state noise covariance matrices
  sigma_eps<-sigma11*diag(d)
  sigma_nu<-diag(c(sigma21^2,sigma22^2,sigma33^2))*(1/12)
  colnames(sigma_nu)<-c("level","slope","Curvat")
  # Factor loadings
  X<-matrix(cbind(rep(1,length(maturities)), slope_factor(lambda,maturities), curv_factor(lambda,maturities)),ncol=3)
  colnames(X)<-c("level","slope","Curvature")
  bt<-matrix(NA, nrow=m, ncol=n+1)
  Pt<-array(NA, dim=c(m,m,n+1))
  btt<-matrix(NA, nrow=m,ncol=n+1)
  Ptt<-array(NA, dim=c(m,m,n+1))
  vt<-matrix(NA, nrow=d, ncol=n)
  # Restriction: penalize if any eigenvalue of Phi is >= 1
  eigen_values<-eigen(Phi,only.values=TRUE)$values
  if(eigen_values[1]>=1||eigen_values[2]>=1||eigen_values[3]>=1){
    loglike=-70000000
  }else{
    c<- (diag(3) - Phi)%*% mu
    loglike<-0
    i<-1
    btt[,1]<-b0
    Ptt[,,1]<-P0
    while(i< n+1){
      # Prediction step
      bt[,i]<- c+ Phi%*% btt[,i]
      Pt[,,i] <- Phi%*% tcrossprod(Ptt[,,i],Phi) + sigma_nu
      vt[,i]<- y[i,] - X%*% bt[,i]
      ft<-X%*% tcrossprod(Pt[,,i], X) + sigma_eps
      det_f<-det(ft)
      if( is.nan(det_f) || is.na(det_f)|| is.infinite(det_f)){
        loglike<- - 700000000
      } else {
        if(det_f<0){
          loglike <- - 700000000
        } else {
          if (abs(det_f)>1e-20){
            # Update step and log-likelihood contribution
            logdet_f<- log(det_f)
            f_inv<- solve(ft, tol=1e-200)
            Kt<- tcrossprod(Pt[,,i],X)%*% f_inv
            btt[,i+1] <- bt[,i] + Kt%*% vt[,i]
            Ptt[,,i+1] <- (diag(3) - Kt%*% X)%*% Pt[,,i]
            loglike_contr<- -0.5*d*log(2*pi) - 0.5 * logdet_f - 0.5*crossprod(vt[,i],f_inv)%*% vt[,i]
            loglike<-loglike+loglike_contr
          } else {
            loglike<- -700000
          }
        }
      }
      i<-i+1
    }
  }
  return(-loglike)
}
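In equation form, the recursions the function is meant to implement are the following, if I have transcribed them correctly from the code. The symbols are my shorthand for the code variables: $b_{t|t-1}$ is `bt`, $P_{t|t-1}$ is `Pt`, $b_{t|t}$ is `btt`, $P_{t|t}$ is `Ptt`, $v_t$ is `vt`, $F_t$ is `ft`, $K_t$ is `Kt`, $\Sigma_\varepsilon = \sigma_{11} I_d$ is `sigma_eps` and $\Sigma_\nu = \tfrac{1}{12}\,\mathrm{diag}(\sigma_{21}^2, \sigma_{22}^2, \sigma_{33}^2)$ is `sigma_nu`. The filter starts from $b_{0|0} = b_0$ and $P_{0|0} = P_0$.

```latex
\begin{aligned}
b_{t|t-1} &= c + \Phi\, b_{t-1|t-1}, \qquad c = (I_3 - \Phi)\,\mu \\
P_{t|t-1} &= \Phi\, P_{t-1|t-1}\, \Phi^\top + \Sigma_\nu \\
v_t &= y_t - X\, b_{t|t-1} \\
F_t &= X\, P_{t|t-1}\, X^\top + \Sigma_\varepsilon \\
K_t &= P_{t|t-1}\, X^\top F_t^{-1} \\
b_{t|t} &= b_{t|t-1} + K_t\, v_t \\
P_{t|t} &= (I_3 - K_t X)\, P_{t|t-1} \\
\log L &= \sum_{t=1}^{n} \left[ -\tfrac{d}{2}\log(2\pi) - \tfrac{1}{2}\log\lvert F_t\rvert - \tfrac{1}{2}\, v_t^\top F_t^{-1} v_t \right]
\end{aligned}
```

The function returns $-\log L$, so optim minimizes the negative log-likelihood.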
Any help would be appreciated.