1
votes
Date        DE      VE
12/1/2016   93.387  0.095
11/1/2016   77.968  0.095
10/1/2016   65.184  0.095
9/1/2016    63.984  0.095
8/1/2016    67.657  0.095
%m/%d/%Y

DE and VE are daily averages. How to convert from daily average to monthly total in R based on the actual days in that month? Total for 12/2016 =93.387*31. Need to calculate the monthly total for all 10*12 months from 2006-01 to 2016-12.

1
I guess the format is %m/%d/%Y, right?akrun
yes..you are right. sorry for the confusion.jkl
@jkl If the solution in the link helps, we can close this as duplicateakrun
I do not think they are the same questions. It has more than 10 years data. I want to calculate monthly total for 2001-01 to 2016-12jkl

1 Answers

4
votes

To find the number of days in a month you can use the days_in_month function in the lubridate package.

The argument takes a datetime object so you have to convert your Date column to a known date/datetime-based class (i.e. "POSIXct, POSIXlt, Date, chron, yearmon, yearqtr, zoo, zooreg, timeDate, xts, its, ti, jul, timeSeries, and fts objects").

Then you can just mutate your df with the multiplicated daily averages.

library(lubridate)
library(dplyr)

myDf <- read.table(text = "Date        DE      VE
12/1/2016   93.387  0.095
11/1/2016   77.968  0.095
10/1/2016   65.184  0.095
9/1/2016    63.984  0.095
8/1/2016    67.657  0.095", header = TRUE)

mutate(myDf, Date = as.Date(Date, format = "%m/%d/%Y"), 
       monthlyTotalDE = DE * days_in_month(Date),
       monthlyTotalVE = VE * days_in_month(Date))

#             Date     DE    VE monthlyTotalDE monthlyTotalVE
#     1 2016-12-01 93.387 0.095       2894.997          2.945
#     2 2016-11-01 77.968 0.095       2339.040          2.850
#     3 2016-10-01 65.184 0.095       2020.704          2.945
#     4 2016-09-01 63.984 0.095       1919.520          2.850
#     5 2016-08-01 67.657 0.095       2097.367          2.945

EDIT

In mutate if you use a new column name, it will append this column to the data frame. If you want to avoid to add columns, you have to keep the columns names that already exist, it will overwrite these columns e.g.

mutate(myDf, Date = as.Date(Date, format = "%m/%d/%Y"), 
       DE = DE * days_in_month(Date), 
       VE = VE * days_in_month(Date))

#         Date       DE    VE
# 1 2016-12-01 2894.997 2.945
# 2 2016-11-01 2339.040 2.850
# 3 2016-10-01 2020.704 2.945
# 4 2016-09-01 1919.520 2.850
# 5 2016-08-01 2097.367 2.945

If you have a lot of columns to compute, I suggest you to use mutate_each, it's very powerfull and will save you the pain to do it manualy with mutate or the loss of performance by doing a traditional loop.

Use vars to include/exclude variables in mutate.

You can exclude variables manualy using the variable name prececed by a minus : vars = -Date or use a vector to exclude several variables vars = c(Date, DE).

Or you can also use special specification functions as in dplyr::select, see ?dplyr::select for more informations.

Warning : If you use vars to include variables, don't explicit the named argument vars = in your function if you want to keep the column names.

one_of(c("DE", "VE")), DE:VE... To drop variables, use - before the function : -contains("Date")

myDf %>%
mutate(Date = as.Date(Date, format = "%m/%d/%Y")) %>%
mutate_each(funs(. * days_in_month(Date)),
            vars = -Date)

#         Date       DE    VE
# 1 2016-12-01 2894.997 2.945
# 2 2016-11-01 2339.040 2.850
# 3 2016-10-01 2020.704 2.945
# 4 2016-09-01 1919.520 2.850
# 5 2016-08-01 2097.367 2.945