8
votes

I am trying to do out-of-sample forecasting using Python's statsmodels. I do not want to simply forecast the next x values from the end of the training set; instead, I want to forecast one value at a time, taking the actual observed values into account at each step. In other words, I want to do rolling one-period forecasts without re-estimating the model each time. The closest post I could find was this one:

ARMA out-of-sample prediction with statsmodels

However, that post uses ARMA, not ARIMA. How can I achieve this with ARIMA, or is there a better method? I know I could pull out the coefficients and apply the formula myself, but in my code the ARIMA model is dynamic over time, so the number of coefficients and lagged values is not constant. Any help would be greatly appreciated.

2 Answers

9
votes

If I understand correctly, I had a very similar problem: I wanted to split my time series into a training set and a test set, train the model, and then predict any element of the test set given its past history. I did not manage to achieve this with the statsmodels ARIMA class, though.

This is how I did it using statsmodels: I applied a first-order difference to the series to achieve stationarity and fitted an ARMA model:

import statsmodels.api as sm
model = sm.tsa.ARMA(fitting_data, order=(p, q), dates=fitting_dates).fit()

I then converted the ARMA model into a pure AR one:

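import numpy as np
from statsmodels.tsa.arima_process import arma2ar

ar_params = model.arparams
ma_params = model.maparams

# arma2ar expects lag polynomials including the zero lag: [1, -ar...] and [1, ma...]
ar_coefficients = arma2ar(np.r_[1, -ar_params], np.r_[1, ma_params], nobs=final_ar_coeff)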
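For intuition, the conversion can be reproduced in a few lines of plain Python. This is only a sketch, not the statsmodels implementation: it assumes a zero-mean ARMA model on the differenced series (no constant term), and the names arma_to_ar_weights, one_step_forecast and n_weights are made up for illustration. n_weights is the truncation length of the AR representation. The loop at the end shows the rolling one-step idea: the weights stay fixed and only the history grows with each actual observation.

```python
def arma_to_ar_weights(ar_params, ma_params, n_weights):
    """Return pi_1..pi_n such that x_t ~ sum_k pi_k * x_(t-k) + e_t."""
    # a holds the coefficients of phi(L) / theta(L), starting with a_0 = 1,
    # where phi(L) = 1 - phi_1 L - ... and theta(L) = 1 + theta_1 L + ...
    a = [1.0]
    for k in range(1, n_weights + 1):
        phi_k = -ar_params[k - 1] if k <= len(ar_params) else 0.0
        ma_part = sum(
            ma_params[j - 1] * a[k - j]
            for j in range(1, min(k, len(ma_params)) + 1)
        )
        a.append(phi_k - ma_part)
    # The AR weights are the negated coefficients after lag zero.
    return [-c for c in a[1:]]


def one_step_forecast(history, weights):
    """Forecast the next value from the most recent observations."""
    recent = history[::-1][:len(weights)]  # newest observation first
    return sum(w * x for w, x in zip(weights, recent))


# Illustrative ARMA(1, 1) parameters and made-up differenced data.
weights = arma_to_ar_weights([0.5], [0.4], n_weights=5)

train = [0.2, -0.1, 0.4]
test = [0.3, -0.2]

history = list(train)
forecasts = []
for actual in test:
    forecasts.append(one_step_forecast(history, weights))
    history.append(actual)  # feed in the real value; the weights are not re-estimated
```

For an ARMA(1, 1) the weights follow the known closed form (phi + theta) * (-theta)^(k-1), which is a handy sanity check on the recursion.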

The nobs parameter sets how many autoregressive coefficients you get back. I tried several values, increasing it until the predictions stopped changing noticeably. Once you have predictions for the differenced series, you need to map them back to the original series. I implemented a function which, given one prediction or a chain of predictions and the last known value before the forecasts, computes the predictions on the original scale:

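def differenced_series_to_original(values, starting_value):
    # Undo first-order differencing by adding each predicted change
    # to the previously reconstructed level.
    original_series = [starting_value]
    for diff in values:
        original_series.append(original_series[-1] + diff)

    # Drop the seed value so only the reconstructed predictions are returned.
    return original_series[1:]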

Here values is the list of predictions on the differenced series and starting_value is the last known element of the original series. Hope it helps with your problem.
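One detail worth spelling out for rolling one-period forecasts: a chain of predictions is accumulated from a single starting value, whereas a one-step forecast should be added to the actual observation just before it. A small sketch of both cases, with made-up numbers and hypothetical helper names (undifference_chain, undifference_one_step):

```python
def undifference_chain(predicted_diffs, starting_value):
    """Multi-step case: each level builds on the previous *predicted* level."""
    levels = []
    current = starting_value
    for diff in predicted_diffs:
        current += diff
        levels.append(current)
    return levels


def undifference_one_step(predicted_diffs, actual_levels):
    """Rolling case: predicted_diffs[i] is the forecast change following
    the *observed* level actual_levels[i]."""
    return [level + diff for level, diff in zip(actual_levels, predicted_diffs)]


predicted_diffs = [1.0, -0.5, 2.0]

# Chained from the last known value 10.0.
chained = undifference_chain(predicted_diffs, 10.0)
# → [11.0, 10.5, 12.5]

# Rolling: each forecast starts from the level actually observed before it.
rolling = undifference_one_step(predicted_diffs, [10.0, 11.5, 10.0])
# → [11.0, 11.0, 12.0]
```

The two only agree on the first step; after that the chained version carries forward its own forecast errors.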

0
votes

From what I understand, you don't want to refit the model every time. There are two possible solutions to this problem:

  1. Save the fitted model as a pickle file and reuse the same model every time you create forecasts.
  2. Extract the coefficients from the model and use them in your own calculations.

Code for both options is below.

  1. Creating the pickle file and reusing it.

    import pickle
    import pmdarima as pm
    model = pm.auto_arima(train,
                          exogenous=exogenous_train,
                          start_p=1, start_q=1,
                          test='adf',       # use adftest to find optimal 'd'
                          max_p=5, max_q=5, # maximum p and q
                          m=12,              # frequency of series
                          d=None,           # let model determine 'd'
                          seasonal=True,    # fit a seasonal model
                          start_P=0, 
                          D=1, 
                          trace=True,
                          error_action='ignore',  
                          suppress_warnings=True, 
                          stepwise=True)
    
    filename = 'ARIMA_Model.sav'
    with open(filename, 'wb') as f:  ## This will create a pickle file
        pickle.dump(model, f)
    
    ## Load Model
    with open(filename, 'rb') as f:
        model = pickle.load(f)
    
    ## Forecast
    fc, confint = model.predict(n_periods=1, 
                        exogenous=exogenous_test_df,
                        return_conf_int=True)
    
  2. Extracting the model coefficients. I used pmdarima for ARIMA, so this is how you can extract them there. I guess it should be similar in other ARIMA libraries.

    Model_dict = model.to_dict()
    Model_Order = Model_dict['order']
    Model_seasonal_order = Model_dict['seasonal_order'][1]
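To show how stored parameters could then be reused without refitting, here is a rough standard-library sketch. Everything in it is an assumption for illustration: the saved dictionary stands in for values pulled from a fitted model, the coefficients and data are made up, and forecast_next only handles a non-seasonal ARIMA(p, 1, 0) without a constant.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for parameters extracted from a fitted ARIMA(2, 1, 0).
saved = {"order": (2, 1, 0), "ar": [0.5, 0.2]}

# Persist once ...
path = os.path.join(tempfile.mkdtemp(), "arima_params.pkl")
with open(path, "wb") as f:
    pickle.dump(saved, f)

# ... and reload whenever a forecast is needed.
with open(path, "rb") as f:
    params = pickle.load(f)


def forecast_next(levels, ar):
    """One-step forecast of the next level for an ARIMA(p, 1, 0), no constant."""
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    recent = diffs[::-1][:len(ar)]  # newest difference first
    return levels[-1] + sum(c * d for c, d in zip(ar, recent))


train = [10.0, 11.0, 13.0]
test = [14.0, 14.5]

history = list(train)
rolling = []
for actual in test:
    rolling.append(forecast_next(history, params["ar"]))
    history.append(actual)  # use the real observation; parameters stay fixed
```

Each pass through the loop forecasts one period ahead from the observed history, so the actual test values are taken into account while the coefficients are never re-estimated.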