I have two data sets. The first one shows the number of dengue cases per week. Here are the first rows of the data set:

season season_week week_start_date denv1_cases denv2_cases denv3_cases denv4_cases
1 1990/1991           1      1990-04-30           0           0           0           0
2 1990/1991           2      1990-05-07           0           0           0           0
3 1990/1991           3      1990-05-14           0           0           0           0
4 1990/1991           4      1990-05-21           0           0           0           0
5 1990/1991           5      1990-05-28           0           0           0           0
6 1990/1991           6      1990-06-04           1           0           0           0
  other_positive_cases additional_cases total_cases
1                    4                0           4
2                    5                0           5
3                    4                0           4
4                    3                0           3
5                    6                0           6
6                    1                0           2

The second column shows the week number within the dengue season and the third column shows the start date of that week. I have another data set that contains daily weather data. Here are the first rows of that data set:

 TMAX TMIN TAVG TDTR PRCP       date
1 26.7 20.6 23.7  6.1  1.3 1956-01-01
2 25.6 21.1 23.4  4.5 20.8 1956-01-02
3 26.7 21.7 24.2  5.0  1.8 1956-01-03
4 26.7 19.4 23.0  7.3  0.0 1956-01-04
5 27.8 17.2 22.5 10.6  0.0 1956-01-05
6 26.1 21.1 23.6  5.0  0.3 1956-01-06

I want to convert the weather data from daily to weekly by averaging all rows within each week, and then merge it with the dengue case data set. But I cannot find a way to convert daily data to weekly. How can I do this?

1 Answer

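The week start dates in your sample (1990-04-30, 1990-05-07, ...) fall on Mondays, so I'm assuming every week in your first data set starts on a Monday. Given that assumption, we can use floor_date from the lubridate package with week_start = 1 (its default is Sunday, which would not match your dates):

library(lubridate)
library(dplyr)

# Snap each daily date to the Monday of its week, then average within each week
df2 <- df2 %>%
    mutate(date = floor_date(ymd(date), unit = "week", week_start = 1)) %>%
    group_by(date) %>%
    summarise_all(.funs = mean)

Make sure that all your columns (aside from the date) are numeric.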
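
The weekday of the sample week start dates can be checked directly, and the same flooring and averaging can be reproduced on a small scale without any packages. This is only a rough base R sketch: the `weather` data frame and its TAVG values are made up for illustration, and `%u` in format() returns the ISO weekday number (1 = Monday, 7 = Sunday).

```r
# Which weekday do the dengue weeks start on? %u gives 1 (Monday) to 7 (Sunday).
format(as.Date(c("1990-04-30", "1990-05-07")), "%u")

# Toy daily data (hypothetical values), covering Mon 1990-04-30 to Sun 1990-05-13
weather <- data.frame(
  date = seq(as.Date("1990-04-30"), by = "day", length.out = 14),
  TAVG = c(rep(20, 7), rep(24, 7))
)

# Floor each date to the Monday of its week:
# step back (weekday number - 1) days
weather$week_start <- weather$date - (as.integer(format(weather$date, "%u")) - 1)

# Average TAVG within each week
weekly <- aggregate(TAVG ~ week_start, data = weather, FUN = mean)
weekly
```

With this toy input, `weekly` should have one row per Monday-based week, which is the shape needed for joining on week_start_date.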

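Then you're free to join it back to df1. Convert week_start_date to a Date first if it is stored as text, so that the key types match:

df3 <- df1 %>%
    mutate(week_start_date = as.Date(week_start_date)) %>%
    left_join(df2, by = c("week_start_date" = "date"))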
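
If the week boundaries of the two data sets do not line up, a left join does not fail; it just leaves the weather columns as NA. A quick sanity check could look like the following base R sketch, where `cases` and `weekly` are hypothetical toy stand-ins for df1 and the weekly df2, and merge() with all.x = TRUE plays the role of left_join():

```r
# Toy stand-ins for df1 and the weekly df2 (weather values are hypothetical)
cases <- data.frame(
  week_start_date = as.Date(c("1990-04-30", "1990-05-07", "1990-05-14")),
  total_cases = c(4, 5, 4)
)
weekly <- data.frame(
  date = as.Date(c("1990-04-30", "1990-05-07")),
  TAVG = c(20, 24)
)

# Keep every row of cases, attach weather where the week start matches
merged <- merge(cases, weekly,
                by.x = "week_start_date", by.y = "date", all.x = TRUE)

# Weeks that found no weather data show up as NA
unmatched <- merged$week_start_date[is.na(merged$TAVG)]
unmatched
```

If `unmatched` is long on your real data, the week start day or the date types probably differ between the two data sets.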

Hope this helps!