Cross validation of monthly time series using fable package

Question

I have monthly time series data, and I want to fit several models from the fable package and use cross-validation to find the best one among them.

# My data
google <-  read_csv("google.csv") %>% 
  tsibble(index = date)

# dimension of the data is 60 by 2.

[Image: sample data]

# Training data for cross validation

google_tr <- google %>%
  slice(1:(n()-1)) %>%
  stretch_tsibble(.init = 3, .step = 1)

# Building models for the data
fc <- google_tr %>% 
  model(ets = ETS(closing_price),
        arima =   ARIMA(closing_price),
        rw = RW(closing_price ~ drift()),
        prophet = prophet(closing_price)) %>% 
  forecast(h = "1 year")

A lot of warnings appeared!

Model evaluation

fc %>% accuracy(google)

I have read https://otexts.com/fpp3/tscv.html and https://otexts.com/fpp3/arima-ets.html#example-comparing-arima-and-ets-on-non-seasonal-data many times, and I still don't know how to select the right training data. If I can get the right inputs to slice() and stretch_tsibble() for monthly data in the chunk below, the problem would be solved.

google_tr <- google %>%
  slice(1:(n()-1)) %>%
  stretch_tsibble(.init = 3, .step = 1)

Rob Hyndman · Accepted Answer · 2020-11-20T00:28:22

I can't comment on that particular data set as you haven't shared it or said which packages have been loaded. However, a couple of points can be made:

Your initial slice is 3 observations. You can't fit an ETS or ARIMA model with 3 observations, so you will get warnings. Warnings will also arise for other tiny slices. I would suggest you start with at least a dozen observations for a monthly data set.
The final warnings are because you have incomplete out-of-sample data. That is, you are forecasting 1 year ahead, and some of your slices end within the last year of observations, so their forecasts run past the end of the data. You can't compare forecasts with actuals when the actuals are unknown.

Here is an example that works with monthly data.

library(fpp3)

test <- USAccDeaths %>% as_tsibble()

test_tr <- test %>%
  slice(1:(n()-1)) %>%
  stretch_tsibble(.init = 12, .step = 1)

fc <- test_tr %>%
  model(ets = ETS(value),
        arima = ARIMA(value),
        rw = RW(value ~ drift())) %>%
  forecast(h = "1 year")

fc %>% accuracy(test)
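
The example above can still trigger the incomplete out-of-sample warning for the last few slices, because they end less than a year before the data does. One way to avoid that, sketched here as an assumption rather than as part of the original answer, is to drop as many observations as the forecast horizon before stretching, so that every slice has a full year of actuals to compare against. The variable name horizon is introduced for illustration only, and ARIMA is left out just to keep the sketch quick to run.

```r
library(fpp3)

# USAccDeaths is a monthly ts object with 72 observations
test <- USAccDeaths %>% as_tsibble()

# Forecast horizon in months (illustrative name, not from the original answer)
horizon <- 12

# Drop the last `horizon` observations before stretching, so the longest
# training slice ends exactly `horizon` months before the end of the data
test_tr <- test %>%
  slice(1:(n() - horizon)) %>%
  stretch_tsibble(.init = 12, .step = 1)

# Fit each model on every slice and forecast `horizon` months ahead
fc <- test_tr %>%
  model(ets = ETS(value),
        rw = RW(value ~ drift())) %>%
  forecast(h = horizon)

# Every forecast now has a matching actual value in `test`
acc <- fc %>% accuracy(test)
acc
```

The trade-off is fewer slices: with 72 observations, .init = 12 and a 12-month horizon, the training sets run from 12 to 60 observations. For the 60-row data set in the question, the same idea would use slice(1:(n() - 12)).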
