I have the following vector, which contains data for each day of December.

vector1 <- c(1056772, 674172, 695744, 775040, 832036,735124,820668,1790756,1329648,1195276,1267644,986716,926468,828892,826284,749504,650924,822256,3434204,2502916,1262928,1025980,1828580,923372,658824,956916,915776,1081736,869836,898736,829368)

Now I want to create a weekly time series object, so I used the following code:

weeklyts = ts(vector1,start=c(2016,12,01), frequency=7)

However, the start and end points are not correct. I always get the following time series:

> weeklyts
Time Series:
Start = c(2017, 5) 
End = c(2021, 7) 
Frequency = 7 
 [1] 1056772  674172  695744  775040  832036  735124  820668 1790756 1329648 1195276 1267644  986716  926468  828892  826284  749504
[17]  650924  822256 3434204 2502916 1262928 1025980 1828580  923372  658824  956916  915776 1081736  869836  898736  829368

Does anybody know what I am doing wrong?

1 Answer

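To get a time series that starts and ends where you expect, think about what the data represent: you have 31 daily values from December 2016.

The start argument of ts takes one or two numbers, not three: the period (for example the year) and the position within that period. Judging by your output, the third value is simply ignored, and with frequency = 7 the 12 is read as position 12 in a cycle of 7, which overflows into the next period (2016 + 11/7 = 2017 + 4/7, hence Start = c(2017, 5)). So use something like c(2016, 1) if you start with month 1 in 2016. See the following example.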

ts(1:12, start = c(2016, 1), frequency = 12) 
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2016   1   2   3   4   5   6   7   8   9  10  11  12
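The same two-number rule applies with frequency = 7. As a minimal sketch, assuming you want each cycle to be one Monday-to-Sunday week, the second number of start can be the weekday of the first observation. The names first_day, dow, daily_values and weekly_cycle are made up for this illustration, numbering the weeks from 1 is an arbitrary choice, and seq_len(31) stands in for vector1.

```r
# 2016-12-01 is the first observation; "%u" gives the ISO weekday (Monday = 1).
first_day <- as.Date("2016-12-01")
dow <- as.integer(format(first_day, "%u"))

# Placeholder values standing in for the 31 daily observations.
daily_values <- seq_len(31)

# Assumption: number the weeks from 1, so start = c(week 1, weekday of day 1).
weekly_cycle <- ts(daily_values, start = c(1, dow), frequency = 7)

start(weekly_cycle)  # week and weekday of the first observation
end(weekly_cycle)    # week and weekday of the last observation
```

Note that this still keeps one value per day. It only tells ts that the seasonal period is seven observations; it does not aggregate anything to weekly totals.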

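Daily data is awkward with ts. A ts object has no notion of calendar dates, only a fixed number of observations per period, so it cannot account for leap years. That is why you often see people using a frequency of 365.25 for a daily series with an annual cycle. To get a correct December 2016 series we can do the following: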

ts(vector1, start = c(2016, 336), frequency = 366)
Time Series:
Start = c(2016, 336) 
End = c(2016, 366) 
Frequency = 366 
 [1] 1056772  674172  695744  775040  832036  735124  820668 1790756 1329648 1195276 1267644  986716  926468  828892  826284  749504
[17]  650924  822256 3434204 2502916 1262928 1025980 1828580  923372  658824  956916  915776 1081736  869836  898736  829368

Note the following things that are going on:

  1. Frequency is 366 because 2016 is a leap year
  2. start is c(2016, 336) because 2016-12-01 is day 336 of the year
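Both numbers can be derived from the date instead of being counted by hand. A small sketch, assuming the series starts on 2016-12-01; the variable names are illustrative and seq_len(31) is a placeholder for vector1:

```r
first_day <- as.Date("2016-12-01")

# "%j" is the day of the year (001-366), "%Y" the four-digit year.
year <- as.integer(format(first_day, "%Y"))
day_of_year <- as.integer(format(first_day, "%j"))

# The day-of-year of December 31 is the number of days in that year.
days_in_year <- as.integer(format(as.Date(paste0(year, "-12-31")), "%j"))

# Placeholder values standing in for the 31 daily observations.
daily <- ts(seq_len(31), start = c(year, day_of_year), frequency = days_in_year)

start(daily)
end(daily)
```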

Personally, I use the xts package (together with zoo) to handle daily data, and I use the functions in xts to aggregate to a weekly time series. The result can then be passed to packages that expect ts objects, such as forecast.

edit: added small xts example

my_df <- data.frame(dates = seq.Date(as.Date("2016-12-01"), as.Date("2017-01-31"), by = "day"),
                    var1 = rep(1:31, 2))

library(xts)
my_xts <- xts(my_df[, -1], order.by = my_df$dates)

# Roll up to weekly totals. Dates shown are the last day of each week.
my_xts_weekly <- period.apply(my_xts, endpoints(my_xts, on = "weeks"), colSums)
head(my_xts_weekly)
           [,1]
2016-12-04   10
2016-12-11   56
2016-12-18  105
2016-12-25  154
2017-01-01  172
2017-01-08   35

Depending on your needs, you can convert the result back into a data.frame or another structure. Read the help for period.apply, since you can supply your own function to be applied to each period. The xts and zoo vignettes are also worth reading.
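If you want to cross-check the weekly totals without extra packages, here is a rough base R sketch. It is not the xts approach above: it uses cut() and tapply() instead, and it labels each week by its Monday rather than by its last day. The names week_start, weekly_sums and weekly_df are made up for this example.

```r
# Same illustrative data as in the xts example: two months of daily values.
my_df <- data.frame(dates = seq(as.Date("2016-12-01"), as.Date("2017-01-31"), by = "day"),
                    var1 = rep(1:31, 2))

# cut() on Dates with breaks = "week" groups by weeks starting on Monday.
week_start <- cut(my_df$dates, breaks = "week")

# Sum var1 within each week and collect the result in a data.frame.
weekly_sums <- tapply(my_df$var1, week_start, sum)
weekly_df <- data.frame(week_start = as.Date(names(weekly_sums)),
                        total = as.vector(weekly_sums))
head(weekly_df)
```

The first six totals should agree with the xts output above (10, 56, 105, 154, 172, 35), since both approaches treat a week as Monday to Sunday.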