I have daily data (dummy data as follows):

 Date   Value
01/01/2014  610413
02/01/2014  243374
03/01/2014  459427
04/01/2014  243769
05/01/2014  415550
06/01/2014  345504
07/01/2014  583661
08/01/2014  406861
09/01/2014  326838
10/01/2014  389894

The data runs until 2016 and I want to fit an ARIMA model. When I check for daily seasonality:

# Check for daily seasonality
library(forecast)
ets(Data2)
fit <- tbats(Data2)
seasonal <- !is.null(fit$seasonal)
seasonal

the result is [1] TRUE.

# Check for weekly seasonality
# start takes at most two values (cycle, position within the cycle)
timeSeriesObj <- ts(Data2, start = c(2014, 1), frequency = 7)
fit <- tbats(timeSeriesObj)
seasonal <- !is.null(fit$seasonal)
seasonal

Again the result is [1] TRUE.

I need to generate forecasts for the next 3 years and, given the seasonality, I wish to use Fourier terms, but I am not very familiar with how to generate them. I went through the post https://robjhyndman.com/hyndsight/forecasting-weekly-data/, but how do I select the optimal number of Fourier terms? Following the post, I ran this code:

bestfit <- list(aicc = Inf)
for (i in 1:25) {
  fit <- auto.arima(data_ts, xreg = fourier(data_ts, K = i), seasonal = FALSE)
  if (fit$aicc < bestfit$aicc) {
    bestfit <- fit
  } else {
    break
  }
}
bestfit
fc <- forecast(bestfit, xreg = fourier(data_ts, K = 7, h = 104))
plot(fc)

But the forecast step throws an error:

Error in forecast.Arima(bestfit, xreg = fourier(data_ts, K = 7, h = 104)) : Number of regressors does not match fitted model

I think this is because I am unable to identify the optimal value of K. Also, is there a better way to deal with the seasonality in my data?

Thanks in advance.

2 Answers

1
votes
fc <- forecast(bestfit, xreg=fourier(data_ts, K=7, h=104))

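I believe K here should be the K of your bestfit, i.e. the value that minimizes AICc, not 7. The xreg passed to forecast() must have the same number of columns as the xreg used to fit the model, so K = 7 fails unless bestfit itself was fitted with K = 7. Note that once the loop breaks, i is one larger than the best K, so record K whenever bestfit is updated rather than reading i afterwards.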
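
As a rough sketch of that idea, the loop can store the winning K next to the model and reuse it when building the future regressors. The simulated series, the names best_fit, best_k and best_aicc, and the cap of K at 3 (fourier() requires K to be at most half the period, and this stand-in series has frequency 7) are assumptions made for illustration only; they are not taken from the question's data.

```r
library(forecast)

# Hypothetical stand-in for the asker's series: two years of daily values
# with a weekly pattern. Replace with your own data_ts.
set.seed(1)
n <- 730
data_ts <- ts(400000 + 50000 * sin(2 * pi * seq_len(n) / 7) +
                rnorm(n, sd = 20000),
              frequency = 7)

best_fit <- NULL
best_k <- NA
best_aicc <- Inf

# With frequency = 7, K can be at most 3 (K <= period / 2).
for (k in 1:3) {
  fit <- auto.arima(data_ts, xreg = fourier(data_ts, K = k), seasonal = FALSE)
  if (fit$aicc < best_aicc) {
    # Keep the model together with the K that produced it.
    best_fit <- fit
    best_k <- k
    best_aicc <- fit$aicc
  }
}

# Reuse best_k so the future regressors have the same columns as the fitted ones.
fc <- forecast(best_fit, xreg = fourier(data_ts, K = best_k, h = 104))
```

Calling plot(fc) afterwards works as in the question. Unlike the original loop, this version checks every candidate K instead of stopping at the first increase in AICc, which is affordable here because only three values are tried.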

I hope it helps.

0
votes

This could help, if you still need an answer:

bestfit <- list(aicc = Inf)
for (i in 1:25) {
  fit <- auto.arima(data_ts, xreg = fourier(data_ts, K = i), seasonal = FALSE)
  if (fit$aicc < bestfit$aicc) {
    bestfit <- fit
  } else {
    break
  }
  print(i)
}

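print(i) runs only when the current K improved the AICc, because the break skips it otherwise. The last number printed is therefore your optimal K value, and that is the K to pass to fourier() when forecasting.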