I have found out that my data set is not consistent. The data Frame is called DF1 For instance there is a NA value for time 9:49, but time 9:48 does not exist at all. It Looks like this:
Time | 1 | 2
2016-05-11 09:45:00 | NA | NA
2016-05-11 09:46:00 | 4.4 | 6.6
2016-05-11 09:47:00 | 5.8 | 7.0
2016-05-11 09:49:00 | NA | NA
2016-05-11 09:50:00 | 5.6 | 7.3
2016-05-11 09:51:00 | 7.4 | 7.4
So I have created a new data Frame comprising all Dates and minutes consistently:
D2 = as.data.frame( seq( from = as.POSIXct("2016-05-11 09:45", tz = "GMT"), to = as.POSIXct("2016-05-11 09:50", tz = "GMT"), by = "min"))
How can I merge D2 and DF1 such that the values in column 1 and 2 at time 9:48 will turn into NA values. It should look like this:
Time | 1 | 2
2016-05-11 09:45:00 | NA | NA
2016-05-11 09:46:00 | 4.4 | 6.6
2016-05-11 09:47:00 | 5.8 | 7.0
2016-05-11 09:48:00 | NA | NA
2016-05-11 09:49:00 | NA | NA
2016-05-11 09:50:00 | 5.6 | 7.3
2016-05-11 09:51:00 | 7.4 | 7.4
In the next step I want to replace the NA values. If it is one NA value it should be replaced by the mean of the previous element and the next one. If there are two NA values the two values should be replaced by the mean of the 2 previous and two next values. If there are three, up until four NA values. If there are more then 4 consecutive NA values the values must not be changed. My new Data should look like this:
Time | 1 | 2
2016-05-11 09:45:00 | NA | NA --> no previous value --> NA
2016-05-11 09:46:00 | 4.4 | 6.6
2016-05-11 09:47:00 | 5.8 | 7.0
2016-05-11 09:48:00 | 5.8 | 7.1 --> column 1 (4.4+5.8+5.6 + 7.4)/4
2016-05-11 09:49:00 | 5.8 | 7.1 --> column 2 (7.3+ 7.4 + 6.6 + 7.0)/4
2016-05-11 09:50:00 | 5.6 | 7.3
2016-05-11 09:51:00 | 7.4 | 7.4