To find the number of days in a month you can use the days_in_month
function in the lubridate package.
The argument takes a datetime object so you have to convert your Date
column to a known date/datetime-based class (i.e. "POSIXct, POSIXlt, Date, chron, yearmon, yearqtr, zoo, zooreg, timeDate, xts, its, ti, jul, timeSeries, and fts objects").
Then you can just mutate
your df with the multiplicated daily averages.
library(lubridate)
library(dplyr)
myDf <- read.table(text = "Date DE VE
12/1/2016 93.387 0.095
11/1/2016 77.968 0.095
10/1/2016 65.184 0.095
9/1/2016 63.984 0.095
8/1/2016 67.657 0.095", header = TRUE)
mutate(myDf, Date = as.Date(Date, format = "%m/%d/%Y"),
monthlyTotalDE = DE * days_in_month(Date),
monthlyTotalVE = VE * days_in_month(Date))
# Date DE VE monthlyTotalDE monthlyTotalVE
# 1 2016-12-01 93.387 0.095 2894.997 2.945
# 2 2016-11-01 77.968 0.095 2339.040 2.850
# 3 2016-10-01 65.184 0.095 2020.704 2.945
# 4 2016-09-01 63.984 0.095 1919.520 2.850
# 5 2016-08-01 67.657 0.095 2097.367 2.945
EDIT
In mutate
if you use a new column name, it will append this column to the data frame.
If you want to avoid to add columns, you have to keep the columns names that already exist, it will overwrite these columns e.g.
mutate(myDf, Date = as.Date(Date, format = "%m/%d/%Y"),
DE = DE * days_in_month(Date),
VE = VE * days_in_month(Date))
# Date DE VE
# 1 2016-12-01 2894.997 2.945
# 2 2016-11-01 2339.040 2.850
# 3 2016-10-01 2020.704 2.945
# 4 2016-09-01 1919.520 2.850
# 5 2016-08-01 2097.367 2.945
If you have a lot of columns to compute, I suggest you to use mutate_each
, it's very powerfull and will save you the pain to do it manualy with mutate
or the loss of performance by doing a traditional loop.
Use vars
to include/exclude variables in mutate
.
You can exclude variables manualy using the variable name prececed by a minus :
vars = -Date
or use a vector to exclude several variables vars = c(Date, DE)
.
Or you can also use special specification functions as in dplyr::select, see ?dplyr::select for more informations.
Warning : If you use vars
to include variables, don't explicit the named argument vars =
in your function if you want to keep the column names.
one_of(c("DE", "VE"))
, DE:VE
... To drop variables, use -
before the function : -contains("Date")
myDf %>%
mutate(Date = as.Date(Date, format = "%m/%d/%Y")) %>%
mutate_each(funs(. * days_in_month(Date)),
vars = -Date)
# Date DE VE
# 1 2016-12-01 2894.997 2.945
# 2 2016-11-01 2339.040 2.850
# 3 2016-10-01 2020.704 2.945
# 4 2016-09-01 1919.520 2.850
# 5 2016-08-01 2097.367 2.945
%m/%d/%Y
, right? – akrun