R: Fill in missing values depending on hour and day

Question

I use R and have a data table with 3 columns:

unixtime | average by hour | 15 seconds value

The data covers several days of a year and every hour of those days. Each hour has one value for "average by hour", stored in the first row of that hour, and 240 values for "15 seconds value". I wrote a for loop that would solve the problem, but it takes hours to run.

for (i in 2:nrow(merge_demand)){
  if (is.na(merge_demand[i,2])) {
    merge_demand[i,2] = merge_demand[i-1,2]
  }
}

Is there a more efficient way to fill the 239 missing values of "average by hour" in each hour with the one existing value for that hour on that day? In total I have 1,682,761 rows.

I am kind of new to data tables so thanks for helping me out!

Hi welcome, could you please provide a sample of your data (using dput() is advised). Maybe also look at stackoverflow.com/help/minimal-reproducible-example — Annet

Tony Ladson · Accepted Answer · 2019-12-15T20:49:57

It's likely to be quicker to use an indexing approach. Here is an idea that you will need to incorporate into a loop:

# ymd_hms() comes from the lubridate package
library(lubridate)

# Generate sample data: one hour of 15-second readings
my_data <- data.frame(unixtime = seq(from = ymd_hms('2000-01-01 00:00:15'),
                                     by = '15 sec',
                                     length.out = 240),
                      average_by_hour = c(5, rep(NA, 239)),
                      value_15_sec = rep(5/240, 240))

# Fill the first 240 values of average_by_hour with the first value
my_data$average_by_hour[1:240] <- my_data$average_by_hour[1]
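
If you would rather avoid the loop altogether, the same indexing idea can be applied to the whole column at once. The following is only a sketch in base R, not something from the original question: `fill_forward` is a hypothetical helper name, and `demo` is made-up toy data in which blocks of 4 rows stand in for the real 240-row hours. It assumes each hour's value sits in the first row of that hour, as described in the question.

```r
# Hypothetical helper: carry the last non-missing value forward (base R only)
fill_forward <- function(x) {
  known <- !is.na(x)
  # idx[i] counts the non-missing values seen up to row i (0 before the first one)
  idx <- cumsum(known)
  # Prepend NA so that rows before the first known value stay NA
  c(NA, x[known])[idx + 1]
}

# Toy data (assumption): 3 blocks of 4 rows, value only on the first row of each block
demo <- data.frame(
  unixtime = 946684800 + 15 * (0:11),
  average_by_hour = c(5, NA, NA, NA, 7, NA, NA, NA, 9, NA, NA, NA)
)

# One vectorised assignment instead of a row-by-row loop
demo$average_by_hour <- fill_forward(demo$average_by_hour)
```

On the real table this would be a single call such as `merge_demand[[2]] <- fill_forward(merge_demand[[2]])`. Packaged versions of the same "last observation carried forward" operation exist as well, for example `zoo::na.locf()`, `tidyr::fill()` and `data.table::nafill(type = "locf")`, if adding a dependency is acceptable.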
