
I have read in and formatted my data set as shown below.

library(xts)

#Read data from file
x <- read.csv("data.dat", header=FALSE)
x[is.na(x)] <- 0                           #If empty fill in zero

#Construct data frames
rawdata.h <- x[, 2:8]                      #Hourly data (columns 2 to 8)
rawdata.15min <- x[, 10, drop = FALSE]     #15 min data (kept as a data frame)

#Convert time index to proper format
index.h <- as.POSIXct(strptime(x[,1], "%d.%m.%Y %H:%M"))      
index.15min <- as.POSIXct(strptime(x[,9], "%d.%m.%Y %H:%M"))

#Set column names
names(rawdata.h) <- c("spot","RKup", "RKdown","RKcon","anm", "pp.stat","prod.h")
names(rawdata.15min) <- c("prod.15min")                                                                   

#Convert data frames to time series objects
data.htemp <- xts(rawdata.h,order.by=index.h)
data.15mintemp <- xts(rawdata.15min,order.by=index.15min)

#Select desired subset period
data.h <- data.htemp["2013"]                 
data.15min <- data.15mintemp["2013"]

I want to combine the hourly data in data.h$prod.h with the 15-minute data in data.15min$prod.15min that falls within the same hour.

For example, I would like to average the hourly value for 2013-12-01 00:00-01:00 with the last 15-minute value in that hour, i.e. the value for 2013-12-01 00:45-01:00. I'm looking for a flexible way to do this for any hour.

Any suggestions?

Edit: Just to clarify further: I want to do something like this:

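N <- NROW(data.h$prod.h)
prod.average <- numeric(N)

for (i in seq_len(N)) {
  # last.15min is a placeholder: the last 15 min value in hour i (this is the part I am missing)
  prod.average[i] <- mean(c(as.numeric(data.h$prod.h[i]), last.15min))
}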
Can you provide a sample of your data with an example of the desired result? Do you have the same number of rows for hourly and 15 min data? And do you want to take only the last 15 minutes of each hour? - Ram Narasimhan
I posted an edit with a potential solution I found. - tfunk
See ?to.period, ?merge.xts, ?na.locf, ?period.apply. Maybe this helps. If not, then making your data reproducible would go a long way towards getting answers. - GSee
Thanks for the input! I see now that there are several ways to go about this, e.g. to.period pretty much does the same as my solution (see last EDIT). - tfunk
You are allowed to answer your own questions. You could leave the question part above, and move the solution that worked into an answer. That way the question can be marked as complete. - Ram Narasimhan

3 Answers

1
votes

I found a solution by reducing the 15-minute data to one value per hour, using the very useful .index* functions from the xts package, as shown below.

prod.new <- data.15min$prod.15min[.indexmin(data.15min$prod.15min) %in% 45:59]

This creates a new time series containing only the values whose timestamps fall in minutes 45-59 of each hour.

For those curious, my data looked like this:

Original hourly series:

> data.h$prod.h[1:4]
2013-01-01 00:00:00 19.744
2013-01-01 01:00:00 27.866
2013-01-01 02:00:00 26.227
2013-01-01 03:00:00 16.013

Original 15 minute series:

> data.15min$prod.15min[1:8]
2013-09-30 00:00:00    16.4251
2013-09-30 00:15:00    18.4495
2013-09-30 00:30:00     7.2125
2013-09-30 00:45:00    12.1913
2013-09-30 01:00:00    12.4606
2013-09-30 01:15:00    12.7299
2013-09-30 01:30:00    12.9992
2013-09-30 01:45:00    26.7522

New series with only the last 15 minutes in each hour:

> prod.new[1:4]
2013-09-30 00:45:00    12.1913
2013-09-30 01:45:00    26.7522
2013-09-30 02:45:00     5.0332
2013-09-30 03:45:00     2.6974
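
To finish the averaging asked for in the question, the filtered series still has to be lined up with the hourly one. The following is only a minimal sketch under stated assumptions: the toy series reuse the sample values printed above but are placed on one common day so that they overlap, the hourly timestamp is assumed to mark the start of its hour, and the names p15, ph, last15 and both are made up for illustration.

```r
library(xts)

# Toy data: values copied from the samples above, moved onto the same day
# (an assumption, so that the hourly and 15 min series overlap).
t15 <- seq(as.POSIXct("2013-01-01 00:00", tz = "UTC"), by = "15 min", length.out = 8)
p15 <- xts(c(16.4251, 18.4495, 7.2125, 12.1913,
             12.4606, 12.7299, 12.9992, 26.7522), order.by = t15)
th  <- seq(as.POSIXct("2013-01-01 00:00", tz = "UTC"), by = "hour", length.out = 2)
ph  <- xts(c(19.744, 27.866), order.by = th)

# Keep only the observation stamped at minute 45 of each hour
last15 <- p15[.indexmin(p15) == 45]

# Shift each timestamp back 45 minutes so it matches the hourly index
index(last15) <- index(last15) - 45 * 60

# Inner join on the timestamps, then average the two columns row by row
both <- merge(ph, last15, join = "inner")
prod.average <- xts(rowMeans(coredata(both)), order.by = index(both))
prod.average
```

If the hourly timestamps mark the end of the hour instead, the shift would be +15 minutes rather than -45 minutes.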
1
votes

Short answer

library(dplyr)

df %>%
   group_by(t = cut(time, "30 min")) %>%
   summarise(v = mean(value))

Long answer

Since you want to aggregate the 15-minute time series to a coarser resolution (30 minutes here), you can use the dplyr package or any other package that implements the "group by" concept.

For instance:

s  = seq(as.POSIXct("2017-01-01"), as.POSIXct("2017-01-02"), "15 min")
df = data.frame(time = s, value=1:97)

df is a data frame holding a time series, with 97 rows and two columns.

head(df)
                 time value
1 2017-01-01 00:00:00     1
2 2017-01-01 00:15:00     2
3 2017-01-01 00:30:00     3
4 2017-01-01 00:45:00     4
5 2017-01-01 01:00:00     5
6 2017-01-01 01:15:00     6

The cut.POSIXt, group_by and summarise functions do the work:

df %>%
   group_by(t = cut(time, "30 min")) %>%
   summarise(v = mean(value))

                    t     v
1 2017-01-01 00:00:00   1.5
2 2017-01-01 00:30:00   3.5
3 2017-01-01 01:00:00   5.5
4 2017-01-01 01:30:00   7.5
5 2017-01-01 02:00:00   9.5
6 2017-01-01 02:30:00  11.5
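
The same grouping idea can be pointed at the original question, which asks for one value per hour (the last 15-minute reading) rather than a 30-minute mean. A minimal sketch, assuming the toy df from this answer; the tz = "UTC" argument and the column names hour and last15 are additions made only for illustration, and the rows are assumed to be sorted by time.

```r
library(dplyr)

# Same toy series as above, with an explicit time zone to avoid DST surprises
s  <- seq(as.POSIXct("2017-01-01", tz = "UTC"),
          as.POSIXct("2017-01-02", tz = "UTC"), "15 min")
df <- data.frame(time = s, value = 1:97)

# Group into hourly bins and keep the last observation in each bin
hourly_last <- df %>%
   group_by(hour = cut(time, "hour")) %>%
   summarise(last15 = last(value))

head(hourly_last)
```

The result can then be joined to an hourly table on the hour column and averaged, much like the merge step in the answer below.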
0
votes

A more robust way is to convert the 15-minute values into hourly values by averaging them, and then apply whatever operation you want to the two hourly series.

### 15 Minutes Data
min15 <- structure(list(
  V1 = structure(1:8,
                 .Label = c("2013-01-01 00:00:00", "2013-01-01 00:15:00",
                            "2013-01-01 00:30:00", "2013-01-01 00:45:00",
                            "2013-01-01 01:00:00", "2013-01-01 01:15:00",
                            "2013-01-01 01:30:00", "2013-01-01 01:45:00"),
                 class = "factor"),
  V2 = c(16.4251, 18.4495, 7.2125, 12.1913, 12.4606, 12.7299, 12.9992, 26.7522)),
  .Names = c("V1", "V2"), class = "data.frame", row.names = c(NA, -8L))
min15

### Hourly Data
hourly <- structure(list(
  V1 = structure(1:4,
                 .Label = c("2013-01-01 00:00:00", "2013-01-01 01:00:00",
                            "2013-01-01 02:00:00", "2013-01-01 03:00:00"),
                 class = "factor"),
  V2 = c(19.744, 27.866, 26.227, 16.013)),
  .Names = c("V1", "V2"), class = "data.frame", row.names = c(NA, -4L))


hourly

### Convert 15min data into hourly data by taking average of 4 values
min15$V1 <- as.POSIXct(min15$V1,origin="1970-01-01 0:0:0")
min15 <- aggregate(. ~ cut(min15$V1,"60 min"),min15[setdiff(names(min15), "V1")],mean)

min15

names(min15) <- c("time","min15")
names(hourly) <- c("time","hourly")


### merge the corresponding values
combined <- merge(hourly,min15)

### average of hourly and 15min values
rowMeans(combined[,2:3])