I'm currently working with a time series that has counts on a weekly basis, but with denominators only available as a monthly total. I'd like to merge them to work with rates.

A year of the monthly data, starting in January 2010 and running through December, looks something like this: 173 173 169 180 173 167 187 175 174 154 163 163

I'd normally use something like PROC EXPAND in SAS, but since I'm trying to learn R, I want to implement this project entirely in R.

It seems like I should interpolate those values as a cubic spline using zoo, but after several attempts I've run aground. I have two implementation questions I'm hoping people can help with:

  1. Using something like na.spline will give me interpolated values, but they're the wrong ones as far as I can tell. The values above aren't point measurements sampled once a month. Each one is the month's total, so a weekly series should have roughly four interpolated points per month, each about 1/4 of the monthly magnitude.

  2. Are there any methods that use the actual dates involved? While it makes sense in the abstract to expand a monthly series to a weekly series by creating four new entries per month, not all months contain exactly four weeks.
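
To make the second point concrete, here is a minimal base-R sketch of a date-aware allocation. It is not a spline and not what PROC EXPAND does: it simply spreads each month's total evenly over that month's actual calendar days, then sums the days into weeks. The names `monthly`, `daily`, and `weekly`, and the choice of Monday-to-Sunday weeks, are assumptions made for illustration.

```r
# Monthly totals from the question, indexed by the 1st of each month
monthly <- data.frame(
  month = seq(as.Date("2010-01-01"), by = "month", length.out = 12),
  total = c(173, 173, 169, 180, 173, 167, 187, 175, 174, 154, 163, 163)
)

# Every calendar day of 2010, mapped to the row of its month
days <- seq(as.Date("2010-01-01"), as.Date("2010-12-31"), by = "day")
month_id <- match(format(days, "%Y-%m"), format(monthly$month, "%Y-%m"))

# Spread each monthly total evenly over the real number of days in that month
days_in_month <- tabulate(month_id, nbins = nrow(monthly))
daily <- monthly$total[month_id] / days_in_month[month_id]

# Assumed convention: weeks run Monday to Sunday; label each day by its Monday
wday <- as.POSIXlt(days)$wday                 # 0 = Sunday, 1 = Monday, ...
week_start <- format(days - (wday + 6) %% 7)

# Weekly denominators: sum the daily shares within each week
weekly <- tapply(daily, week_start, sum)
```

Each month's days still add up to that month's total, and a week that straddles two months picks up days from both. The result is a step function rather than a smooth curve, so it is only a baseline to compare a spline-based method against.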

In SAS, for reference, it would involve something like the following:

PROC EXPAND data=work.denom out=work.weekly from=month to=week;
    ID month;
    CONVERT denominator / method=SPLINE observed=TOTAL;
run;

Where month is a timeseries index for a particular date (I picked the 1st of the month), and denominator is the variable being expanded from a monthly to a weekly series.

Do you have any reason to believe that an intraweek cycle exists within the data, e.g., daily counts decrease from Monday to Wednesday, and then increase from Thursday to Sunday? If you did, you could make an educated guess about how to partition the month's total across its days. - Jubbles
@Jubbles If there is such a cycle, I don't know it. - Fomite
Can you post your weekly count data? - Jubbles
@Jubbles Sadly, no. All I've got right now are the monthly versions - expanding out the denominator was something I am working on in parallel to the weekly data being pulled. - Fomite
Have a look at the tempdisagg package (cran.r-project.org/web/packages/tempdisagg) - Rob Hyndman

1 Answer

You can disaggregate the time series using the tempdisagg package. First disaggregate to a daily series with the td function, then aggregate the daily values into weeks with xts's apply.weekly function. See the code below:

library(tempdisagg)
library(lubridate)
library(xts)

df <- data.frame(
  date =  ymd("2010-01-01") + months(0:11),
  value = c(173, 173, 169, 180, 173, 167, 187, 175, 174, 154, 163, 163)
)

# Disaggregate to daily
df_da <- td(df ~ 1, to = "daily", method = "denton-cholette", conversion = "sum")$values

# Aggregate to weekly
data <- as.xts(df_da$value, order.by = df_da$time)
weekly <- apply.weekly(data,sum)

plot(weekly, type = "p")

Output (you can see "outliers" in the first and last weeks of 2010 because those weeks were not full within that year, e.g. Jan 4, 2010 was a Monday):

[Figure: point plot of the weekly series]
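
As a quick check on the partial weeks mentioned above, the following base-R sketch counts how many days of 2010 fall into each week. It assumes weeks run Monday to Sunday, which is what the remark about Jan 4 suggests. The variable names are illustrative and not part of tempdisagg or xts.

```r
# Every calendar day of 2010
days <- seq(as.Date("2010-01-01"), as.Date("2010-12-31"), by = "day")

# Label each day with the Monday that starts its week (assumed convention)
wday <- as.POSIXlt(days)$wday                 # 0 = Sunday, 1 = Monday, ...
week_start <- format(days - (wday + 6) %% 7)

# Number of 2010 days that land in each week
days_per_week <- table(week_start)

# Weeks with fewer than 7 days are the ones whose sums look like outliers
short_weeks <- days_per_week[days_per_week < 7]
print(short_weeks)
```

Under that convention only the first week (3 days, Jan 1-3) and the last week (5 days, Dec 27-31) are short. Dropping those two weeks, or comparing weekly sums per day rather than per week, removes the apparent outliers.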