1
votes

I have data, which has multiple monthly variables. I would like to aggregate these variables to quarterly level. My initial data is:

Time           A      B      C     D  . . . . . K
Jan-2004       42     57     53    28
Feb-2004       40     78     56    28
Mar-2004       68     77     53    20
Apr-2004       97     96     80    16
May-2004       84     93     76    17
Jun-2004       57     100    100   21
Jul-2004       62     100    79    22
. 
.
.
.
N

So the goal is calculate quarters as monthly averages (sum(jan+feb+mar)/3)). In other words, the goal is to end up data like this:

Time        A       B      C       D    . . . . . K
2004Q1      50,0   70,7   54,0    25,3
2004Q2      79,3   96,3   85,3    18,0
2004Q3
.
.
.
N

Could anyone help me with this problem?

Thank you very much.

2
Can you include reproducible data with dput?divibisan

2 Answers

1
votes

An option would be to convert the 'Time' to yearqtr class with as.yearqtr from zoo and do a summarise_all

library(zoo)
library(dplyr)
df1 %>% 
    group_by(Time = format(as.yearqtr(Time, "%b-%Y"), "%YQ%q")) %>%
    summarise_all(mean)
# A tibble: 3 x 5
#  Time       A     B     C     D
#  <chr>  <dbl> <dbl> <dbl> <dbl>
#1 2004Q1  50    70.7  54    25.3
#2 2004Q2  79.3  96.3  85.3  18  
#3 2004Q3  62   100    79    22  

data

df1 <- structure(list(Time = c("Jan-2004", "Feb-2004", "Mar-2004", "Apr-2004", 
"May-2004", "Jun-2004", "Jul-2004"), A = c(42L, 40L, 68L, 97L, 
84L, 57L, 62L), B = c(57L, 78L, 77L, 96L, 93L, 100L, 100L), C = c(53L, 
56L, 53L, 80L, 76L, 100L, 79L), D = c(28L, 28L, 20L, 16L, 17L, 
21L, 22L)), class = "data.frame", row.names = c(NA, -7L))
1
votes

data.table has quarter function, you can do:

library(data.table)
setDT(my_data)

my_data[ , lapply(.SD, mean), by = .(year = year(Time), quarter = quarter(Time))]

This is the gist of it. Getting it to work exactly would require a reproducible example.