I am using the aggts() function from the hts package to aggregate my hierarchical time series. The function replaces NAs with zeros before the time series are aggregated. This is useful if at least one of the observations is not NA. But if all observations for a given time point are NA, I want to keep NA instead of 0.
Edit (working example):
library(hts)
df <- data.frame(
AB = c(5, 10, 15, NA, 25, 30, NA, 40)
, AA = c(10, 20, 30, NA, 50, 60, 70, 80)
)
hts_object <- hts(df)
> aggts(hts_object)
Time Series:
Start = 1
End = 8
Frequency = 1
Total AB AA
1 15 5 10
2 30 10 20
3 45 15 30
4 0 0 0
5 75 25 50
6 90 30 60
7 70 0 70
8 120 40 80
But what I need is:
> aggts(hts_object)
Time Series:
Start = 1
End = 8
Frequency = 1
Total AB AA
1 15 5 10
2 30 10 20
3 45 15 30
4 NA NA NA
5 75 25 50
6 90 30 60
7 70 NA 70
8 120 40 80
Edit2 (after updating 'hts' package):
> aggts(hts_object)
Time Series:
Start = 1
End = 8
Frequency = 1
Total AB AA
1 15 5 10
2 30 10 20
3 45 15 30
4 NA NA NA
5 75 25 50
6 90 30 60
7 NA NA 70
8 120 40 80
This is not what I was expecting: in row 7 the Total is now NA, although AA has a value of 70. Maybe this will be clearer with some background information. Due to Covid-19 I have to flag several monthly data points as outliers. If all observations for a given time point are NA across the hierarchy, I would like to keep the NAs after aggregating the time series. But if only some of the observations below a given hierarchy level are NA, the sum of the non-missing values is required.
My real life business examples are:
global outliers for all hierarchy levels (like for Covid-19)
--> all aggregated time series should contain NA if all bottom time series are NA
products with different market entry times (some time series have leading NAs)
--> aggregated levels require sum(na.rm = TRUE)
classic missing observations
--> aggregated levels require sum(na.rm = TRUE), and interpolation may be required beforehand
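To make the rule concrete, here is a minimal base R sketch of the aggregation I am after for the two-level example above. It does not use hts at all, and sum_keep_all_na is a hypothetical helper name of my own, not a function from any package:

```r
# Hypothetical helper (not part of hts): sums with na.rm = TRUE,
# but returns NA when every input value is NA.
sum_keep_all_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# Same bottom-level data as in the working example.
df <- data.frame(
  AB = c(5, 10, 15, NA, 25, 30, NA, 40),
  AA = c(10, 20, 30, NA, 50, 60, 70, 80)
)

# Apply the helper row by row, i.e. once per time point.
bottom <- as.matrix(df)
total <- apply(bottom, 1, sum_keep_all_na)

# Combine the total with the untouched bottom-level series.
agg <- ts(cbind(Total = total, bottom))
print(agg)
```

With this, row 4 stays NA in every column, while row 7 gets a Total of 70 and AB stays NA. For a deeper hierarchy the same helper would have to be applied to each group of bottom-level columns separately, which is the part I was hoping aggts() could do for me.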