1
votes

I have a data frame (DF) with two columns. In column one I have dates, in column two I have my value of interest (VOI).

DF looks like this:

|---------------------|------------------|
|        Date         |        VOI       |
|---------------------|------------------|
|          Jan-1971   |         34       |
|---------------------|------------------|
|          Jan-1972   |         28       |
|---------------------|------------------|
|          Jan-1973   |         29       |
|---------------------|------------------|
|          Jan-1974   |         37       |
|---------------------|------------------|
|             ...     |         ...      |
|---------------------|------------------|
|          Jan-2017   |         36       |
|---------------------|------------------|
|          Fev-1971   |         48       |
|---------------------|------------------|
|          Fev-1972   |         49       |
|---------------------|------------------|
|          Fev-1973   |         52       |
|---------------------|------------------|
|          Fev-1974   |         50       |
|---------------------|------------------|
|          ...        |         ...      |
|---------------------|------------------|
|          Mar-1971   |         30       |
|---------------------|------------------|
|          ...        |         ...      |
|---------------------|------------------|
|          Mar-2017   |         36       |
|---------------------|------------------|
|          ...        |         ...      |
|---------------------|------------------|
|          Dez-1971   |         15       |
|---------------------|------------------|
|          ...        |        ...       |
|---------------------|------------------|
|          Dez-2017   |         19       |
|---------------------|------------------|

In a nutshell, the rows are grouped by month rather than ordered chronologically.

First I have all the VOIs for January from 1971 to 2017 (47 data points), then all the VOIs for February over the same period (again 47 data points). This pattern repeats through December, which also has 47 data points.

I applied ymd() from the lubridate package to transform my dates into POSIXct values.

Now I want to create a time series object out of my VOIs. I tried:

ts = xts(x = df$Vazao, order.by = index(df$Date))

and

ts = xts(x = df$Vazao, order.by = df$Data)

but neither worked. I don't know where I am making a mistake, but I wonder whether it has anything to do with the fact that my dates are not in chronological order. I thought that using ymd() would sort that out and "make R understand" that my time series goes Jan 1971, Feb 1971, Mar 1971, ..., Dec 2017.

How would I transform this data frame into a time series object?

Thank you for your input.

2
You should dput a sample of your data. - Christoph

2 Answers

2
votes

Is this what you are looking for?

First, make up some data.

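y <- 1971:2017
length(y)  # 47 years
# Stepping through 2017 every 28 days hits each month at least once;
# unique() then leaves the twelve month abbreviations.
m <- seq(as.Date("2017-01-01"), as.Date("2017-12-31"), by = 28)
m <- unique(format(m, "%b"))
# expand.grid() varies y fastest, so rows are grouped by month as in the question.
Date <- expand.grid(y, m)[2:1]
Date <- apply(Date, 1, paste, collapse = "-")
DF <- data.frame(Date = Date, VOI = sample(100, length(Date), TRUE))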
head(DF)
#      Date VOI
#1 Jan-1971  12
#2 Jan-1972  89
#3 Jan-1973  99
#4 Jan-1974  77
#5 Jan-1975   5
#6 Jan-1976  46

Now it's just a matter of calling xts() with the appropriate arguments. Note that your Date column doesn't have a day value, so I have to paste one in. Day 01 is always a good choice.

library(xts)

ts <- xts(DF[, "VOI"], order.by = as.Date(paste0("01-", DF$Date), "%d-%b-%Y"))

str(ts)
#An ‘xts’ object on 1971-01-01/2017-12-01 containing:
#  Data: int [1:564, 1] 76 90 7 61 3 49 1 19 51 90 ...
#  Indexed by objects of class: [Date] TZ: UTC
#  xts Attributes:  
# NULL


head(ts)
#           [,1]
#1971-01-01   76
#1971-02-01   90
#1971-03-01    7
#1971-04-01   61
#1971-05-01    3
#1971-06-01   49
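
One caveat: %b is matched against the month abbreviations of the current locale, and the question's labels ("Fev", "Dez") look Portuguese. Below is a minimal sketch of a locale-independent alternative. The lookup vector pt_months and the helper parse_month_year() are hypothetical names made up for this illustration, and the abbreviations are assumed to be spelled exactly as in the question. The four values are taken from the table in the question.

```r
library(xts)

# Hypothetical lookup table: Portuguese month abbreviations, assumed to be
# spelled as in the question ("Fev", "Dez"). Use month.abb for English labels.
pt_months <- c("Jan", "Fev", "Mar", "Abr", "Mai", "Jun",
               "Jul", "Ago", "Set", "Out", "Nov", "Dez")

# Hypothetical helper: turn "Fev-1971" into the Date 1971-02-01
# without relying on the locale-dependent %b format.
parse_month_year <- function(x, months = pt_months) {
  parts <- strsplit(x, "-", fixed = TRUE)
  mon <- match(vapply(parts, `[`, character(1), 1), months)
  yr  <- as.integer(vapply(parts, `[`, character(1), 2))
  as.Date(sprintf("%04d-%02d-01", yr, mon))
}

# Toy data in the question's layout: grouped by month, then by year.
toy <- data.frame(
  Date = c("Jan-1971", "Jan-1972", "Fev-1971", "Fev-1972"),
  VOI  = c(34, 28, 48, 49),
  stringsAsFactors = FALSE
)

# xts() sorts the observations by the order.by index.
toy_xts <- xts(toy$VOI, order.by = parse_month_year(toy$Date))
index(toy_xts)     # index in chronological order
coredata(toy_xts)  # values reordered to match: 34, 48, 28, 49
```

The point of the sketch is that the rows do not need to be sorted beforehand: as long as order.by holds proper dates, xts() puts the observations in chronological order itself.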
1
votes

Since your Date column holds only a month and a year, you can use zoo::as.yearmon() to convert it to class yearmon, which xts() accepts as an index.

The help page for xts describes what the order.by argument expects:

An xts object extends the S3 class zoo from the package of the same name.

The first difference in this extension provides for a requirement that the index values not only be unique and ordered, but also must be of a time-based class. Currently acceptable classes include: Date, POSIXct, timeDate, as well as yearmon and yearqtr where the index values remain unique.

One possible solution:

# Sample data. This data will have Date in `Jan-1971` format.
# Data has been created only for 36 months.  
set.seed(1)
df <- data.frame( Date = format(seq(as.Date("1971-01-01"), 
                     as.Date("1973-12-31"), by="month"), "%b-%Y"),
            VOI = as.integer(runif(36)*100), stringsAsFactors = FALSE)


library(zoo)    
library(xts)


#Convert Date column to type `yearmon`
ts = xts(x = df$VOI, order.by = as.yearmon(df$Date, "%b-%Y"))

head(ts)
# [,1]
# Jan 1971   26
# Feb 1971   37
# Mar 1971   57
# Apr 1971   90
# May 1971   20
# Jun 1971   89
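
The sample data above are already in chronological order, whereas the rows in the question are grouped by month. Here is a rough sketch that checks the same approach on month-grouped input. It assumes an English (or C) locale so that %b matches the entries of month.abb. The names grid, grouped and grouped_xts are made up for this illustration, and the VOI values are just row numbers.

```r
library(zoo)
library(xts)

# Rebuild the question's layout for 1971-1973: all Januaries first,
# then all Februaries, and so on (expand.grid varies year fastest).
grid <- expand.grid(year = 1971:1973, month = month.abb,
                    stringsAsFactors = FALSE)
grouped <- data.frame(Date = paste(grid$month, grid$year, sep = "-"),
                      VOI  = seq_len(nrow(grid)),  # row number as a stand-in value
                      stringsAsFactors = FALSE)

# Assumes %b matches English abbreviations in the current locale.
grouped_xts <- xts(grouped$VOI, order.by = as.yearmon(grouped$Date, "%b-%Y"))

# Jan, Feb and Mar 1971 sat in rows 1, 4 and 7 of the grouped input,
# so those row numbers should now lead the series.
as.vector(coredata(grouped_xts))[1:3]

# TRUE when the index is strictly increasing, i.e. chronological.
all(diff(as.numeric(index(grouped_xts))) > 0)
```

If the second check returns TRUE, the month-grouped layout needs no manual reordering before calling xts().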