I am new to forecasting and to using ARIMA with external regressors. I am trying to forecast the next day's customer demand from historical data; I have 337 daily observations. Here is the code:

library(zoo)      #zoo()
library(forecast) #msts(), fourier(), auto.arima(), forecast()

modelFrequency=7
YearlyFrequency=84 #for multi seasonality
dataDuration = 1
futureHorizon = 1
testDataObs=67 #20% observations for testcase
minDateActuals =  "2017-02-27"
trainDateActuals = "2017-11-23"
maxDateActuals = "2018-01-29"
actualsAvg=data.frame(Demand=rnorm(337,mean=100,sd=30)) #for trial purpose.
actualsTS=zoo(actualsAvg$Demand,seq(from=as.Date(minDateActuals),to=as.Date(maxDateActuals),by = dataDuration),frequency = modelFrequency)
actualsTS = ts(actualsTS)
actualsMSTS = msts(actualsTS,seasonal.periods = c(modelFrequency,YearlyFrequency))

train = zoo(actualsTS,seq(from=as.Date(minDateActuals),to=as.Date(trainDateActuals),by = dataDuration))
train = ts(train,frequency = modelFrequency)
trainMSTS = msts(train,seasonal.periods =  c(modelFrequency,YearlyFrequency))


aic_vals_temp=NULL #Trying to find a model with min. AIC
aic_vals=NULL
for (i in 1:3){
  for (j in 1:6){
     xreg1<-fourier(trainMSTS,K=c(i,j))
     fitma1<-auto.arima(trainMSTS,xreg=xreg1,seasonal = FALSE)
     aic_vals_temp<-cbind(i,j,fitma1$aic)
     aic_vals<-rbind(aic_vals,aic_vals_temp)
     print(aic_vals_temp)
    }
}

colnames(aic_vals) = c("Fourier7","Fourier84","AICValues")
aic_vals=data.frame(aic_vals)
minAICVal = min(aic_vals$AICValues)
minVals = aic_vals[which(aic_vals$AICValues == minAICVal),]
minVals=minVals[1,]
xregTrain=fourier(trainMSTS,K=c(minVals$Fourier7,minVals$Fourier84))
armaFourierTrain = auto.arima(trainMSTS,xreg=xregTrain,seasonal = FALSE)

xregTest=fourier(trainMSTS,K=c(minVals$Fourier7,minVals$Fourier84),h=testDataObs)
armaFourierTest =forecast(armaFourierTrain,xreg=xregTest, h=testDataObs)

 #works very well up to this point on the test data set.
 #I will evaluate this model's errors and compare them with another model's performance on the test set, then
 #use one of the models to project next-day demand.
 #Now I want to use all available data to forecast the next day.

actualsAvailable=zoo(actualsTS, seq(from = as.Date(minDateActuals), to = as.Date(maxDateActuals), by = dataDuration))
actualsAvailableTS=ts(actualsAvailable,frequency = modelFrequency)
actualsAvailableMSTS =msts(actualsAvailable,seasonal.periods = c(modelFrequency,YearlyFrequency))
aic_vals_temp=NULL
aic_vals=NULL
for (i in 1:3){
  for (j in 1:6){
    xreg1<-fourier(actualsAvailableMSTS,K=c(i,j))
    #xreg2<-fourier(trainMSTS,K=c(j))
    #xtrain<-cbind(xreg1,xreg2)
    fitma1<-auto.arima(actualsAvailableMSTS,xreg=xreg1,seasonal = FALSE)
    aic_vals_temp<-cbind(i,j,fitma1$aic)
    aic_vals<-rbind(aic_vals,aic_vals_temp)
    print(aic_vals_temp)
  }
}

colnames(aic_vals) = c("Fourier7","Fourier84","AICValues")
aic_vals=data.frame(aic_vals)
minAICVal = min(aic_vals$AICValues)
minVals = aic_vals[which(aic_vals$AICValues == minAICVal),]       
xregAllData = fourier(actualsAvailableMSTS,K=c(minVals$Fourier7,minVals$Fourier84),h = futureHorizon)    
#printed value of xregAllData:
#          S1-7       C1-7     S1-84     C1-84
# [1,] 0.9749279 -0.2225209 0.1490423 0.9888308
armaFourierAllData =auto.arima(actualsAvailableMSTS,xreg=data.frame(xregAllData),seasonal = FALSE)  #Problem is here.

Error in rowSums(xregg) : 'x' must be an array of at least two dimensions

It worked well for the test data forecast using a similar approach. xregAllData has dimensions 1 x 4 (one row, four columns). What am I doing wrong? I would appreciate any help, insights, or explanations. Pardon any typos in the above code.

xregAllData = fourier(actualsAvailableMSTS,K=c(minVals$Fourier7,minVals$Fourier84))
armaFourierAllData = auto.arima(actualsAvailableMSTS,xreg=xregAllData,seasonal = FALSE)
xregAllData = fourier(actualsAvailableMSTS,K=c(minVals$Fourier7,minVals$Fourier84),h=futureHorizon) #future terms must continue from the series the model was fitted on
armaFourierFC = forecast(armaFourierAllData,xreg=xregAllData, h=futureHorizon)
I had left out some code; the missing lines are above. Thanks to the folks who reviewed it. - user2162611
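To make the difference between the two regressor matrices concrete, here is a minimal base-R sketch. `fourier_terms` is a hypothetical helper written only for illustration; it is not `forecast::fourier()`, and it assumes the usual sine/cosine definition with time indexed from 1 and one harmonic per period.

```r
# Hypothetical helper (not forecast::fourier): builds sine/cosine pairs for
# the given time indices, with harmonics 1..K[p] for each seasonal period.
fourier_terms <- function(times, periods, K) {
  cols <- list()
  for (p in seq_along(periods)) {
    for (k in seq_len(K[p])) {
      cols[[paste0("S", k, "-", periods[p])]] <- sin(2 * pi * k * times / periods[p])
      cols[[paste0("C", k, "-", periods[p])]] <- cos(2 * pi * k * times / periods[p])
    }
  }
  do.call(cbind, cols)
}

n <- 337  # number of daily observations in the question
h <- 1    # forecast horizon

# Regressors for fitting: one row per observed day (times 1..n).
xreg_fit <- fourier_terms(seq_len(n), periods = c(7, 84), K = c(1, 1))

# Regressors for forecasting: one row per future day (times n+1..n+h).
xreg_future <- fourier_terms(n + seq_len(h), periods = c(7, 84), K = c(1, 1))

dim(xreg_fit)     # 337 rows, 4 columns
dim(xreg_future)  # 1 row, 4 columns
round(xreg_future, 7)
```

The single future row agrees with the row printed in the question, which suggests that supplying `h` yields regressors for the forecast step only. The model fit needs the matrix with one row per observation.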

1 Answer

forecast::auto.arima() calls stats::arima(), which drops any xreg matrix columns that are all zero. This can trigger errors in seemingly unrelated places, including:

  • 'x' must be an array of at least two dimensions
  • xreg is rank deficient
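The first message is what base R's `rowSums()` reports when it is handed something that is no longer a matrix. The sketch below uses made-up data and does not call forecast or stats::arima(); it only reproduces the mechanism by which removing columns can collapse a matrix into a plain vector.

```r
# Made-up regressor matrix: column "b" is all zero.
m <- cbind(a = c(1, 2, 3), b = c(0, 0, 0))

# Removing "b" with ordinary indexing collapses the result to a vector.
v <- m[, "a"]
is.matrix(v)  # FALSE

# rowSums() needs at least two dimensions, so this call fails.
msg <- tryCatch(rowSums(v), error = function(e) conditionMessage(e))
msg  # in an English locale: 'x' must be an array of at least two dimensions

# Keeping the matrix structure avoids the error.
kept <- m[, "a", drop = FALSE]
rowSums(kept)
```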

Possible solutions:

  • add more data
  • modify the code so that it does not pass xreg columns containing only zeros
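The second option could look roughly like the following base-R sketch. `drop_zero_columns` is a hypothetical helper name and the regressor values are made up; the idea is simply to filter the matrix before it is passed as `xreg`.

```r
# Hypothetical helper: remove columns that are entirely zero while keeping
# the result a matrix, even if only one column survives.
drop_zero_columns <- function(xreg) {
  xreg <- as.matrix(xreg)
  keep <- colSums(xreg != 0, na.rm = TRUE) > 0
  xreg[, keep, drop = FALSE]
}

# Toy regressors (made-up data): "holiday" never takes a non-zero value.
xreg <- cbind(
  promo   = c(0, 1, 0, 1, 1),
  holiday = c(0, 0, 0, 0, 0),
  temp    = c(12.1, 13.4, 11.8, 14.0, 12.9)
)

cleaned <- drop_zero_columns(xreg)
colnames(cleaned)  # "promo" "temp"
dim(cleaned)       # 5 rows, 2 columns
```

If the same filter is applied to the fitting matrix, the future regressors passed to the forecast step would need the same columns removed so that both matrices stay aligned.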