1
votes

I have been given a dataset for 118 days. I'm supposed to forecast the values for the next 28 days. I've tried out the below code. But I'm getting the same values for all the 28 days. Can you help me find my mistake? Thank you.

library(forecast)
library(dplyr)
head(product)
ts_product = ts(product$Qty, start=1,frequency=1)
ts_product
plot(ts_product)
#predictions of 28 days
m_ets = ets(ts_product)
f_ets = forecast(m_ets, h=28)
plot(f_ets)

The data for Qty is given by:

Qty = c(53, 40, 37, 45, 69, 105, 62, 101, 104, 46, 92, 157, 133, 173, 139, 163, 145, 154, 245, 147, 85, 131, 228, 192, 240, 346, 267, 267, 243, 233, 233, 244, 241, 136, 309, 236, 310, 266, 280, 321, 349, 335, 410, 226, 391, 314, 250, 368, 282, 203, 250, 233, 233, 277, 338, 279, 279, 266, 253, 178, 238, 126, 279, 258, 350, 277, 226, 287, 180, 268, 191, 279, 214, 133, 292, 212, 307, 232, 165, 107, 121, 188, 198, 154, 128, 85, 106, 67, 63, 88, 107, 56, 41, 59, 27, 58, 80, 75, 93, 54, 14, 36, 107, 82, 83, 112, 37, 57, 9, 51, 47, 57, 68, 97, 25, 45, 69, 89)

This is the prediction I get.

Point Forecast      Lo 80    Hi 80      Lo 95    Hi 95
119       69.53429   2.089823 136.9788  -33.61312 172.6817
120       69.53429  -2.569107 141.6377  -40.73834 179.8069
121       69.53429  -6.944751 146.0133  -47.43031 186.4989
122       69.53429 -11.083248 150.1518  -53.75959 192.8282
123       69.53429 -15.019428 154.0880  -59.77946 198.8480
124       69.53429 -18.780346 157.8489  -65.53129 204.5999
125       69.53429 -22.387517 161.4561  -71.04798 210.1166
126       69.53429 -25.858385 164.9270  -76.35622 215.4248
127       69.53429 -29.207323 168.2759  -81.47798 220.5466
128       69.53429 -32.446345 171.5149  -86.43163 225.5002
129       69.53429 -35.585612 174.6542  -91.23273 230.3013
130       69.53429 -38.633808 177.7024  -95.89454 234.9631
131       69.53429 -41.598429 180.6670 -100.42854 239.4971
132       69.53429 -44.485993 183.5546 -104.84468 243.9133
133       69.53429 -47.302214 186.3708 -109.15172 248.2203
134       69.53429 -50.052133 189.1207 -113.35736 252.4259
135       69.53429 -52.740222 191.8088 -117.46844 256.5370
136       69.53429 -55.370474 194.4391 -121.49106 260.5596
137       69.53429 -57.946468 197.0150 -125.43070 264.4993
138       69.53429 -60.471431 199.5400 -129.29230 268.3609
139       69.53429 -62.948280 202.0169 -133.08032 272.1489
140       69.53429 -65.379664 204.4482 -136.79880 275.8674
141       69.53429 -67.768000 206.8366 -140.45144 279.5200
142       69.53429 -70.115495 209.1841 -144.04163 283.1102
143       69.53429 -72.424177 211.4928 -147.57245 286.6410
144       69.53429 -74.695908 213.7645 -151.04676 290.1153
145       69.53429 -76.932409 216.0010 -154.46719 293.5358
146       69.53429 -79.135268 218.2038 -157.83618 296.9048

Also, do you think any model other than ets, which we used here, would work for this problem?

2
Post some data via 'dput'. - coatless
@Coatless Thank you for the suggestion. I have added the data as requested. - Raj

2 Answers

6
votes

Understanding ets()

The ets() function fits exponential smoothing state space models. By default, ets() selects a model automatically (model = 'ZZZ'), and it uses the frequency= of the supplied time series to decide whether seasonal components can be considered. This is where the problem lies: with an incorrectly specified frequency=, no seasonal model is fitted, and the model selected here has neither a trend nor a seasonal component, which yields the flat forecasts.
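To see which model ets() actually selected, one can inspect the fitted object. The following is a minimal sketch, assuming the Qty values from the question are pasted in directly (the original code reads them from a data frame named product); the names ts_flat, fit_flat and fc_flat are only illustrative.

```r
library(forecast)

# Qty values copied from the question (118 days)
Qty <- c(53, 40, 37, 45, 69, 105, 62, 101, 104, 46,
         92, 157, 133, 173, 139, 163, 145, 154, 245, 147,
         85, 131, 228, 192, 240, 346, 267, 267, 243, 233,
         233, 244, 241, 136, 309, 236, 310, 266, 280, 321,
         349, 335, 410, 226, 391, 314, 250, 368, 282, 203,
         250, 233, 233, 277, 338, 279, 279, 266, 253, 178,
         238, 126, 279, 258, 350, 277, 226, 287, 180, 268,
         191, 279, 214, 133, 292, 212, 307, 232, 165, 107,
         121, 188, 198, 154, 128, 85, 106, 67, 63, 88,
         107, 56, 41, 59, 27, 58, 80, 75, 93, 54,
         14, 36, 107, 82, 83, 112, 37, 57, 9, 51,
         47, 57, 68, 97, 25, 45, 69, 89)

# Same setup as in the question: frequency = 1 means no seasonal period
ts_flat <- ts(Qty, start = 1, frequency = 1)
fit_flat <- ets(ts_flat)

# Name of the selected model, in the form "ETS(error,trend,season)"
print(fit_flat$method)

# Error, trend, season and damping codes; with frequency = 1 the
# season code can only be "N" (none)
print(fit_flat$components)

# Range of the 28 point forecasts; a model without trend or season
# repeats a single value, so both ends of the range coincide
fc_flat <- forecast(fit_flat, h = 28)
print(range(fc_flat$mean))
```

If the second and third letters in the model name are both N, the model is simple exponential smoothing, whose point forecast is the last estimated level repeated for every horizon.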

Seasonalities

You may think that one should specify frequency=1 within a ts() object for daily data. However, that is an incorrect way to go about it. In fact, the correct way to specify frequency= is to understand R's "unique" definition:

The frequency is the number of observations per season.

Thus, we need to care about the seasonality of your data.

There are two guiding tables to consult.

The first is a macro view:

Data    Frequency
Annual      1
Quarterly   4
Monthly     12
Weekly      52

The second is a micro view:

Data         Frequencies
             Minute  Hour   Day    Week    Year
Daily                              7       365.25
Hourly                      24     168     8766
Half-hourly                 48     336     17532
Minutes              60     1440   10080   525960
Seconds      60      3600   86400  604800  31557600

There are two seasonalities (e.g. frequency= options) to consider with daily data:

7 (weekly) and 365.25 (yearly)

For more information see: Seasonal periods
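As a rough illustration of what frequency = 7 means to R, the sketch below uses only base R. It assumes the first 28 values of Qty from the question as a small stand-in series; the names qty_4wk, ts_weekly and week_pos_means are illustrative. Since the question gives no dates, position 1 in each cycle is simply the first day of the data, not necessarily a Monday.

```r
# First four weeks (28 days) of Qty from the question, used as a stand-in
qty_4wk <- c(53, 40, 37, 45, 69, 105, 62,
             101, 104, 46, 92, 157, 133, 173,
             139, 163, 145, 154, 245, 147, 85,
             131, 228, 192, 240, 346, 267, 267)

# frequency = 7: seven observations make up one season (a week)
ts_weekly <- ts(qty_4wk, start = 1, frequency = 7)

# Number of observations per season
print(frequency(ts_weekly))

# Position (1 to 7) of each observation within its week
print(cycle(ts_weekly))

# Mean quantity at each position in the week; large differences between
# positions would hint at a weekly pattern
week_pos_means <- tapply(ts_weekly, cycle(ts_weekly), mean)
print(week_pos_means)
```

With frequency = 1, cycle() would return 1 for every observation, which is why no weekly pattern can be picked up in that case.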

Revisiting the estimation

The reason ets() is not working appropriately is the seasonality used (i.e. frequency = 1). Changing it based on the above, we get:

# Changed the frequency to 7
ts_product = ts(product$Qty, start=1, frequency=7)

# Predictions of 28 days
m_ets <- ets(ts_product)
f_ets <- forecast(m_ets, h = 28)
plot(f_ets)

freq_7

Alternative models

There are two other models worth looking into briefly: HoltWinters() and auto.arima(). A discussion of the former is available at: HoltWinters vs. ets

hw = HoltWinters(ts_product)
# 28-step-ahead predictions with 95% prediction intervals
f_hw = predict(hw, n.ahead = 28, prediction.interval = TRUE, level = 0.95)
plot(hw, f_hw)

HoltWinters

The ARIMA generated by running auto.arima():

aa = auto.arima(ts_product)
f_aa = forecast(aa, h = 28)
plot(f_aa)

auto_arima

Misc data note

Briefly looking at your data, plotted as a plain daily series:

# Same data as above, without a seasonal period
ts_product = ts(product$Qty, start=1, frequency=1)
plot(ts_product)

ts_graph

Note the relatively large disturbance between times 18 and 85, which makes the series look non-stationary. You may wish to difference the series first via diff() and then repeat the above.
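A minimal base R sketch of first differencing, again assuming the first 28 values of Qty as a stand-in (the names qty_4wk, d1 and restored are illustrative). It also shows that diffinv() undoes the operation, which matters because forecasts made on a differenced series are forecasts of day-to-day changes and have to be accumulated back to the original scale.

```r
# First four weeks (28 days) of Qty from the question, used as a stand-in
qty_4wk <- c(53, 40, 37, 45, 69, 105, 62,
             101, 104, 46, 92, 157, 133, 173,
             139, 163, 145, 154, 245, 147, 85,
             131, 228, 192, 240, 346, 267, 267)

ts_weekly <- ts(qty_4wk, start = 1, frequency = 7)

# First difference: the change from one day to the next
d1 <- diff(ts_weekly)

# Differencing drops one observation
print(length(d1))

# diffinv() accumulates the changes again, starting from the first value
restored <- diffinv(as.numeric(d1), xi = qty_4wk[1])
print(all.equal(restored, qty_4wk))

plot(d1, main = "First difference of Qty (stand-in data)")
```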

In addition, you may want to try to obtain a full year's worth of data instead of only 118 days.

0
votes

Take a look at ?arima. For example:

library(forecast)  # provides forecast()
mar = arima(product$Qty, order = c(1, 0, 1))
f_ar = forecast(mar, h = 28)
plot(f_ar)


Your data appears to have seasonality. Try to use that information in the ets or arima models.
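One way to pass seasonality to arima() is its seasonal= argument. The sketch below uses base R only and assumes a weekly period of 7; the orders chosen here (one regular and one seasonal difference, each with a moving-average term) are an untuned illustration, not a recommendation for this data, and the names fit_sar and pred_sar are hypothetical.

```r
# Qty values copied from the question (118 days)
Qty <- c(53, 40, 37, 45, 69, 105, 62, 101, 104, 46,
         92, 157, 133, 173, 139, 163, 145, 154, 245, 147,
         85, 131, 228, 192, 240, 346, 267, 267, 243, 233,
         233, 244, 241, 136, 309, 236, 310, 266, 280, 321,
         349, 335, 410, 226, 391, 314, 250, 368, 282, 203,
         250, 233, 233, 277, 338, 279, 279, 266, 253, 178,
         238, 126, 279, 258, 350, 277, 226, 287, 180, 268,
         191, 279, 214, 133, 292, 212, 307, 232, 165, 107,
         121, 188, 198, 154, 128, 85, 106, 67, 63, 88,
         107, 56, 41, 59, 27, 58, 80, 75, 93, 54,
         14, 36, 107, 82, 83, 112, 37, 57, 9, 51,
         47, 57, 68, 97, 25, 45, 69, 89)

ts_weekly <- ts(Qty, start = 1, frequency = 7)

# Seasonal ARIMA with illustrative (assumed, untuned) orders:
# non-seasonal part (0,1,1), seasonal part (0,1,1) with period 7
fit_sar <- arima(ts_weekly,
                 order = c(0, 1, 1),
                 seasonal = list(order = c(0, 1, 1), period = 7))

# 28-day-ahead point predictions and their standard errors
pred_sar <- predict(fit_sar, n.ahead = 28)
print(pred_sar$pred)
print(pred_sar$se)
```

Comparing candidate orders by AIC(fit_sar), or letting auto.arima() from the forecast package search over them, would be a reasonable next step.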