0
votes

I have data that looks like this:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   19 obs. of  7 variables:
 $ Week Ending  : chr  "5/1/18" "5/1/18" "5/1/18" "5/1/18" ...
 $ Agent        : chr  "telbenja ." "Tomsaint ." "davidlor ." "moniquec 
." ...
 $ Inbound      : int  25 62 44 36 1 22 144 36 28 51 ...
 $ Manual       : int  0 3 4 22 0 0 13 6 2 1 ...
 $ Avg Talk Time: 'hms' num  00:03:29 00:03:20 00:03:51 00:02:37 ...
  ..- attr(*, "units")= chr "secs"
 $ Avg Wrap Time: 'hms' num  00:01:57 00:01:13 00:01:31 00:01:24 ...
  ..- attr(*, "units")= chr "secs"
 $ Avg Hold Time: 'hms' num  00:00:11 00:00:02 00:00:02 00:00:00

This is just a sample, I have about 100,000 rows.

Ultimately what I need is to have an 'Agent' called 'Average' whose values in all other columns are just the average of all other rows within the same 'Week Ending' (date).

I believe the solution to this is some sort of group_by and summarize dplyr sorcery, however I can't seem to make this work for returning row values, group by and summarize would give me a brand new column, but thats not what I want, I need a new row entry for each date ('Week Ending') which features the mean of values in each column of the same date.

Any help with this is greatly appreciated (fully anticipating shock and horror for my wording/problem, hit the downvote if you are having trouble sleeping).

Result of dput(head(my_data)):

dput(head(response_codes))
structure(list(`Response Code` = structure(c(105L, 72L, 79L, 
159L, 104L, 17L), .Label = c("304001", "312001", "799007", "843001", 
"951001", "1490001", "1490002", "1524002", "1524003", "1620001", 
"1696001", "2297001", "2299001", "2302001", "2305001", "2312001", 
"2314001", "2315001", "2316001", "2317001", "2327001", "2328001", 
"2329001", "2330001", "2333001", "2374001", "2380002", "2415001", 
"2420001", "2428001", "2428004", "2428005", "2428006", "2434001", 
"2435002", "2444002", "2449002", "2457002", "2457003", "2462001", 
"2463001", "2463002", "2478001", "2586010", "2673002", "2677001", 
"2678002", "2682001", "2683002", "2835005", "2938001", "2950001", 
"2974001", "3006001", "3006002", "3007001", "3046001", "3077003", 
"3091001", "3093001", "3093010", "3094003", "3115001", "3115006", 
"3115010", "3116001", "3116003", "3117001", "3117002", "3148001", 
"3214001", "3239001", "3244001", "3245001", "3245002", "3245003", 
"3262001", "3262002", "3273001", "3276001", "3276002", "3276003", 
"3276005", "3276006", "3276012", "3276013", "3276017", "3276019", 
"3276020", "3276021", "3276023", "3276030", "3276036", "3276037", 
"3276038", "3276039", "3276043", "3276044", "3276045", "3276048", 
"3276050", "3289001", "3330001", "3334001", "3334002", "3347001", 
"3348001", "3361001", "3382001", "3383001", "3393001", "3394001", 
"3394002", "3399001", "3403005", "3486003", "3488003", "3491001", 
"3558001", "3584001", "3585002", "3586001", "3588001", "3591001", 
"3677002", "3677003", "3678001", "3678002", "3691003", "3691004", 
"3691005", "3691006", "3691009", "3691010", "3691014", "3692001", 
"3693002", "3694002", "3695002", "3741001", "3743001", "3753001", 
"3753002", "3755001", "3762001", "3765001", "3766001", "3767001", 
"3767002", "3768001", "3769001", "3771001", "3772001", "3792001", 
"3795001", "3797001", "3799001", "3800001", "3810001", "7014001", 
"7371007", "7445001", "9007001", "9009001"), class = "factor"), 
    `Total Recruits` = c(518L, 467L, 345L, 335L, 333L, 224L), 
    `Number of 2nd Purchase (Converts)` = c(217L, 248L, 181L, 
    106L, 218L, 150L), `Total Cms that took a wp on or after their recruitment case` = c(187L, 
    169L, 142L, 104L, 361L, 233L), `Currently Closed Wine Plans` = c(135L, 
    130L, 108L, 79L, 295L, 188L), `Currently Active Wine Plans` = c(52L, 
    39L, 34L, 25L, 66L, 45L), `Upgrade to WP %` = c(36.1, 36.19, 
    41.16, 31.04, 108.41, 104.02), `2nd Purchase Conversion Rate` = c(41.89, 
    53.1, 52.46, 31.64, 65.47, 66.96), `Number of Conti Cases Purchased` = c(232L, 
    208L, 171L, 108L, 449L, 353L), `Number of Distinct WP Customers` = c(94L, 
    101L, 84L, 51L, 193L, 141L)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))
1
It will be easier for people who want to help you if you can share some actual example data, not just its structure, e.g. by including the output of dput(head(your_table)) in your question. And even better, an example of expected output and any code you've tried.Jon Spring
It sounds like you want to add rows with the weekly average of each column. I presume something like your_data %>% group_by('Week Ending') %>% summarize_if(is.numeric, mean) to get your summary rows, and then something like bind_rows(your_data, summary_table). Can't answer any more specifically w/o example data.Jon Spring
Thank you for your reply! Thats exactly what I am after, though I would like to return a row value within my original dataframe for each weekly summary - the agent name as 'Average' and all other variables just summarized as the mean of that weeks data. For every week. Thanks again for your help with this!Davide Lorino
Is there a "Week ending" variable in this data? I'm not sure how you meant to group it.Jon Spring

1 Answers

1
votes
library(dplyr)

# 1. Adding fake week and Agent
response_codes <- response_codes %>%
  mutate(fake_week = rep(1:3, each = 2),
         Agent = letters[1:6])

# 2. Make summary by week
summarized <- response_codes %>%
  group_by(fake_week) %>%
  summarise_if(is.numeric, mean) %>%
  mutate(Agent = "Average")

# 3. Combine
combo <- bind_rows(response_codes, summarized)

Output

# Just showing columns 1-3 and 10-12:
> combo[, c(1:3,10:12)]
# A tibble: 9 x 6
  `Response Code` `Total Recruits` `Number of 2nd Purchase (Converts)` `Number of Distinct WP Customers` fake_week Agent  
  <fct>                      <dbl>                               <dbl>                             <dbl>     <int> <chr>  
1 3334002                     518                                 217                               94           1 a      
2 3239001                     467                                 248                              101           1 b      
3 3273001                     345                                 181                               84           2 c      
4 3810001                     335                                 106                               51           2 d      
5 3334001                     333                                 218                              193           3 e      
6 2314001                     224                                 150                              141           3 f      
7 NA                          492.                                232.                              97.5         1 Average
8 NA                          340                                 144.                              67.5         2 Average
9 NA                          278.                                184                              167           3 Average