I'm trying to load time series in R with the 'zoo' library.

The observations I have vary in precision. Some have the day, month and year, others only the month and year, and others only the year:

02/10/1915
1917
07/1917
07/1918
30/08/2018

Subsequently, I need to aggregate the rows by year, and by year and month. The base R as.Date function doesn't handle mixed precision like this. How can I model this data with zoo?

Thanks, Mulone

How will you aggregate by month for data that has no month? - agstudy
I will just exclude it. There is an easy solution that consists of creating three separate fields and treating them differently, but I was wondering if zoo had some functionality to do this. - Mulone
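
For reference, the "three separate fields" idea from the comment above could look roughly like the following. This is only a sketch in base R, not something zoo provides: the values in `val` are made-up placeholders, and the helper name `pick` is purely illustrative.

```r
# Index strings from the question, paired with placeholder values (assumed).
idx <- c("02/10/1915", "1917", "07/1917", "07/1918", "30/08/2018")
val <- c(1, 2, 3, 4, 5)

# Split each index on "/" and read the pieces from the right:
# the last piece is always the year, the one before it (if any) is the month,
# and the one before that (if any) is the day.
parts <- strsplit(idx, "/", fixed = TRUE)
pick <- function(p, k) {
  if (length(p) >= k) as.integer(p[length(p) - k + 1]) else NA_integer_
}

DF <- data.frame(
  year  = sapply(parts, pick, k = 1),
  month = sapply(parts, pick, k = 2),
  day   = sapply(parts, pick, k = 3),
  value = val
)

# Yearly means use every row; year/month means skip rows with no month.
by.year <- tapply(DF$value, DF$year, mean)
has.month <- !is.na(DF$month)
by.ym <- tapply(DF$value[has.month],
                paste(DF$year, DF$month, sep = "-")[has.month], mean)
```

Rows with a missing month stay in `DF` and are only filtered out at the point where the month is actually needed.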

1 Answer

The test data below pairs each index value from the question with an arbitrary number:

# test data
Lines <- "02/10/1915 1
1917 2
07/1917 3
07/1918 4
30/08/2018 5"

yearly aggregation

library(zoo)
to.year <- function(x) as.numeric(sub(".*/", "", as.character(x)))
read.zoo(text = Lines, FUN = to.year, aggregate = mean)

The last line returns:

1915 1917 1918 2018 
 1.0  2.5  4.0  5.0 
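
To see why this works, it can help to run the helper on the raw index strings by themselves. A minimal sketch using only base R; the vector name `idx` is just for illustration:

```r
# Index strings from the question, at three levels of precision.
idx <- c("02/10/1915", "1917", "07/1917", "07/1918", "30/08/2018")

# The greedy ".*/" removes everything up to and including the last slash,
# so only the trailing year survives; year-only strings pass through unchanged.
to.year <- function(x) as.numeric(sub(".*/", "", as.character(x)))

years <- to.year(idx)
print(years)
```

Both "1917" and "07/1917" map to the same year, which is why the values 2 and 3 are averaged into 2.5 in the result above.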

year/month aggregation

Since year/month aggregation of data with no month makes no sense, we first drop the year-only records and then aggregate the rest:

DF <- read.table(text = Lines, as.is = TRUE)

# remove year-only records.  DF.ym has at least year and month.
yr <- suppressWarnings(as.numeric(DF[[1]]))
DF.ym <- DF[is.na(yr), ]

# remove day, if present, and convert to yearmon.
to.yearmon <- function(x) as.yearmon( sub("\\d{1,2}/(\\d{1,2}/)", "\\1", x), "%m/%Y" )

read.zoo(DF.ym, FUN = to.yearmon, aggregate = mean)

The last line gives:

Oct 1915 Jul 1917 Jul 1918 Aug 2018 
       1        3        4        5 
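
The day-stripping regular expression can also be checked on its own, without zoo. This sketch assumes the same dd/mm/yyyy and mm/yyyy layouts as in the question; `strip.day` is an illustrative name, not a zoo function:

```r
# Index strings that carry at least a month and a year.
idx <- c("02/10/1915", "07/1917", "07/1918", "30/08/2018")

# Drop a leading "dd/" only when a second "mm/" group follows it;
# strings that are already "mm/yyyy" do not match and are left unchanged.
strip.day <- function(x) sub("\\d{1,2}/(\\d{1,2}/)", "\\1", x)

ym <- strip.day(idx)
print(ym)
```

Every element of `ym` is then in month/year form, which is what the "%m/%Y" format passed to as.yearmon expects.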

UPDATE: simplifications