
I have checked several similar questions but cannot quite find any code that works for my data. I have 2 datasets (df1 and df2): df1 contains time intervals and df2 contains precipitation data. I would like to get the total amount of precipitation for each time interval in df1. Because of all the other data in df1, I cannot combine the 2 datasets. Each line of df1 corresponds to an individual observation, and I need the total rain for that observation's time span.

df1 has date intervals:


  [1] 1969-06-18 UTC--NA             1972-06-19 UTC--NA             1989-06-18 UTC--NA            
  [4] 1992-06-13 UTC--NA             1993-06-17 UTC--1993-10-02 UTC 1997-06-21 UTC--1997-09-19 UTC

df2 has daily precipitation data (from 1987 to 2018). Output of head(df2):

         Date rain_mm
1  1987-06-01     0.0
2  1987-06-02     0.0
3  1987-06-03     0.0
4  1987-06-04     0.0
5  1987-06-05     6.0
6  1987-06-06     6.4

How can I find the sum of rainfall during each time interval? I created a start date (df1$Date) and an end date (df1$end) from the interval, then tried the following:

df1$rain <- NA #empty column for data

for(i in 1:nrow(df1)) {
      df1$rain[i] <- sum(df2$rain_mm[which(
                           df1$Date >= df2$Date[i] &
                           df2$Date <= df1$end[i])])}

There were 50 or more warnings (use warnings() to see the first 50)

 df1$rain 
NULL

Warning message:
Unknown or uninitialised column: 'rain'.

The code ran but didn't seem to actually work. The biggest issue is getting the sum over a time interval. Any help is greatly appreciated.


1 Answer


Finally solved it. For anyone interested in the answer...

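df1$rain <- NA_real_ # empty numeric column for the totals

for (i in seq_len(nrow(df1))) {
  s <- df1$Date[i] # interval start
  e <- df1$end[i]  # interval end
  if (is.na(s) || is.na(e)) {
    # no total can be computed unless both endpoints are known
    df1$rain[i] <- NA
  } else {
    # sum the daily rain for all dates from s to e, inclusive
    df1$rain[i] <- sum(df2$rain_mm[df2$Date >= s & df2$Date <= e], na.rm = TRUE)
  }
}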

I also added a check so that the output is NA if either date (s or e) is NA.
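
For anyone who prefers to avoid the explicit loop, the same idea can be sketched with vapply() in base R. This is only a minimal illustration: the toy df1 and df2 below are made-up stand-ins for the real data, and rain_between is a hypothetical helper name, not something from the original code. It assumes df1$Date, df1$end and df2$Date are all of class Date.

```r
# Toy data standing in for the real df1/df2 (values are made up for illustration).
df1 <- data.frame(
  Date = as.Date(c("1987-06-01", "1987-06-04", "1987-06-02")),
  end  = as.Date(c("1987-06-03", "1987-06-06", NA))
)
df2 <- data.frame(
  Date    = as.Date("1987-06-01") + 0:5,
  rain_mm = c(0, 0, 0, 0, 6.0, 6.4)
)

# Hypothetical helper: total rain from s to e (inclusive),
# or NA when either endpoint is missing.
rain_between <- function(s, e) {
  if (is.na(s) || is.na(e)) return(NA_real_)
  sum(df2$rain_mm[df2$Date >= s & df2$Date <= e], na.rm = TRUE)
}

# Apply the helper to each row of df1; vapply guarantees a numeric vector back.
df1$rain <- vapply(
  seq_len(nrow(df1)),
  function(i) rain_between(df1$Date[i], df1$end[i]),
  numeric(1)
)

print(df1)
```

Iterating over row indices rather than over the date vectors themselves keeps the Date class intact inside the helper, so the comparisons against df2$Date behave the same way as in the loop.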