I'm attempting to use the timeSeries package in R to aggregate data from a timeSeries object. I wrote some basic sample code for reference:
library(timeSeries)
library(timeDate)
BD <- as.timeDate(paste("2015-01-01", "00:00:00")) # Creates a timeDate.
ED <- as.timeDate(paste("2015-01-31", "23:59:00")) # Creates a timeDate.
DR <- seq(BD, ED, by = 60) # Creates a sequence by minutes in between the 2 dates.
data <- runif(length(DR), 0, 100) # Creating random sample data.
x <- timeSeries(data, DR) # Initializing a timeSeries object from data and DR.
colnames(x) <- "Data" # Renaming column.
by <- timeSequence(BD, ED, by = "hour") # Setting the hourly sequence to aggregate on.
x.agg <- timeSeries::aggregate(x, by, sum) # Aggregating on that sequence.
After running the code, the head of x.agg looks like this:
> head(x.agg)
GMT
Data
2015-01-01 00:00:00 29.71688
2015-01-01 01:00:00 3129.84860
2015-01-01 02:00:00 2398.92438
2015-01-01 03:00:00 3134.78608
2015-01-01 04:00:00 2743.79543
2015-01-01 05:00:00 3159.38404
Notice that the first value, for "2015-01-01 00:00:00", is far smaller than the other hourly sums. In fact, it is exactly the same as the first data point in the original sample:
> head(x)
GMT
Data
2015-01-01 00:00:00 29.71688
2015-01-01 00:01:00 38.73175
2015-01-01 00:02:00 1.01945
2015-01-01 00:03:00 89.64938
2015-01-01 00:04:00 34.23608
2015-01-01 00:05:00 90.48571
After investigating where the sums come from, I found that the aggregate for the "2015-01-01 01:00:00" hour is the sum of all data points from "2015-01-01 00:01:00" to "2015-01-01 01:00:00" (inclusive), as shown here:
> sum(x[2:61,])
[1] 3129.849
> x.agg[2,]
GMT
Data
2015-01-01 01:00:00 3129.849
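The grouping this implies can be illustrated without the package. The sketch below is only a model of the observed output, not a description of how timeSeries works internally; the names minutes, right_closed and left_closed are made up for the illustration. It compares bins that are closed on the right (each hour label collects the 60 minutes ending at that label) with bins that are closed on the left (each hour label collects the 60 minutes starting at that label).

```r
# Minute offsets from 2015-01-01 00:00:00 for the first three hours.
minutes <- 0:179

# Right-closed bins: minute 0 sits alone in bin 0, minutes 1..60 fall in bin 1.
# This mirrors the observed output, where the 00:00:00 row holds one data point.
right_closed <- ceiling(minutes / 60)

# Left-closed bins: minutes 0..59 fall in bin 0, minutes 60..119 in bin 1.
# This is the grouping the question is asking for.
left_closed <- floor(minutes / 60)

# Number of data points that land in each bin under the two conventions.
table(right_closed)
table(left_closed)
```

Under the right-closed convention the first bin holds a single point, which matches the small first value in head(x.agg).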
What I need is for the aggregation to sum all the data points within the "00:00:00" hour; that is, the aggregate for "2015-01-01 00:00:00" should be equal to:
> sum(x[1:60,])
[1] 3065.829
including the first minute of that hour, not the first minute of the next hour as the aggregation currently does. It seems that the aggregation function does not consider the first minute of an hour to be part of that hour, which I find very strange. Any help would be greatly appreciated.
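For reference, here is a minimal base R sketch of the grouping being asked for. It does not use timeSeries at all: it assumes plain POSIXct timestamps and labels every observation with the start of its hour before summing. The names times, values, hour_start and hourly_sum are made up for the illustration, and the fixed seed is only there to make the sketch reproducible.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# One timestamp per minute for January 2015, in GMT.
times <- seq(as.POSIXct("2015-01-01 00:00:00", tz = "GMT"),
             as.POSIXct("2015-01-31 23:59:00", tz = "GMT"),
             by = "min")
values <- runif(length(times), 0, 100)  # random sample data

# Label each observation with the start of the hour it falls in, so the
# "00:00:00" bin covers 00:00:00 through 00:59:00.
hour_start <- format(times, "%Y-%m-%d %H:00:00", tz = "GMT")

# Sum within each hour label; ISO-style labels sort chronologically.
hourly_sum <- tapply(values, hour_start, sum)

head(hourly_sum)

# Check that the first bin equals the sum of the first 60 minutes.
isTRUE(all.equal(as.numeric(hourly_sum)[1], sum(values[1:60])))
```

The result is a named numeric array rather than a timeSeries object, so it would still need to be converted back if the rest of the workflow depends on timeSeries.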
?aggregate. Something like aggregate.ts(x, nfrequency = 1/60) yields better results, but it still misses your objective. – DJJ