0
votes

I am hoping someone can help me come up with a solution to my problem. I have a data frame that I'm looking to add 2 new sets of values based on the existing data frame.

  1. Weighted Average to come up with a composite weighted average by stat (2014 = .4, 2015 = .4, and 2016 = .2) multiplied by the grouped stat (HR, R, RBI, SB)
  2. 3 Year Avg (Same Idea as above, but just a straight 3 year mean of the 3 most current year)

I'd like the new data to be identified under the season column.

Here's the data.frame:

full_table_raw <- structure(list(playerID = c("abreujo02", "abreujo02", 
"abreujo02", "abreujo02", "abreujo02", "abreujo02", "abreujo02", 
"abreujo02", "abreujo02", "abreujo02", "abreujo02", "abreujo02", 
"arenano01", "arenano01", "arenano01", "arenano01", "arenano01", 
"arenano01", "arenano01", "arenano01", "arenano01", "arenano01", 
"arenano01", "arenano01", "blackch02", "blackch02", "blackch02", 
"blackch02", "blackch02", "blackch02", "blackch02", "blackch02", 
"blackch02", "blackch02", "blackch02", "blackch02"), season = c(2014L, 
2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 
2016L, 2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 
2016L, 2016L, 2016L, 2016L, 2014L, 2014L, 2014L, 2014L, 2015L, 
2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2016L), stat = c("HR", 
"R", "RBI", "SB", "HR", "R", "RBI", "SB", "HR", "R", "RBI", "SB", 
"HR", "R", "RBI", "SB", "HR", "R", "RBI", "SB", "HR", "R", "RBI", 
"SB", "HR", "R", "RBI", "SB", "HR", "R", "RBI", "SB", "HR", "R", 
"RBI", "SB"), points = c(3, 2, 3, 2, 2, 1, 2, 1, 1, 1, 2, 1, 
1, 1, 1, 1, 3, 3, 3, 2, 3, 3, 3, 2, 2, 3, 2, 3, 1, 2, 1, 3, 2, 
2, 1, 3), ranks = c(1, 2, 1, 2, 2, 3, 2, 3, 3, 3, 2, 3, 3, 3, 
3, 3, 1, 1, 1, 2, 1, 1, 1, 2, 2, 1, 2, 1, 3, 2, 3, 1, 2, 2, 3, 
1), value = c(36, 80, 107, 3, 30, 88, 101, 0, 25, 67, 100, 0, 
18, 58, 61, 2, 42, 97, 130, 2, 41, 116, 133, 2, 19, 82, 72, 28, 
17, 93, 58, 43, 29, 111, 82, 17)), class = "data.frame", row.names = c(NA, 
-36L))
1

1 Answers

0
votes

You can do it this way:

full_table_raw %>% 
  # add a new column with the weights to apply
  mutate(weight = ifelse(season == 2016, .2, .4)) %>%
  # group_by, and then compute your averages
  group_by(stat) %>% 
  summarize(
    average = sum(value) / 3,
    weighted_average = sum(value * weight))

Which gives

# A tibble: 4 x 3
  stat  weighted_average average
  <chr>            <dbl>   <dbl>
1 HR                83.8    85.7
2 R                258     264  
3 RBI              275     281  
4 SB                35.0    32.3

If instead of computing a summary you want to add weighted_average and average as columns to your initial data frame, you can simply replace summarize by mutate

# A tibble: 36 x 9
# Groups:   stat [4]
   playerID  season stat  points ranks  value weight average weighted_average
   <chr>      <int> <chr>  <dbl> <dbl>  <dbl>  <dbl>   <dbl>            <dbl>
 1 abreujo02   2014 HR      3.00  1.00  36.0   0.400    85.7             83.8
 2 abreujo02   2014 R       2.00  2.00  80.0   0.400   264              258  
 3 abreujo02   2014 RBI     3.00  1.00 107     0.400   281              275  
 4 abreujo02   2014 SB      2.00  2.00   3.00  0.400    32.3             35.0
 5 abreujo02   2015 HR      2.00  2.00  30.0   0.400    85.7             83.8
 6 abreujo02   2015 R       1.00  3.00  88.0   0.400   264              258  
 7 abreujo02   2015 RBI     2.00  2.00 101     0.400   281              275  
...

Note that we can use a final %>% select(-weight) to remove the column weight that we added