Weighted mean calculation in R with missing values

Question

Is it possible to calculate a weighted mean in R when some values are missing, such that the weights of the values that are present are scaled up proportionately?

To make this concrete, I created a hypothetical scenario. It shows the root of the question: the scaling factor applied to the weights has to be adjusted for each row, depending on which values are missing.

Image: Weighted Mean Calculation

File: Weighted Mean Calculation in Excel

It's definitely possible to do in R. Try having a go yourself and posting some example code here where you run into problems. — Scransom
Thanks qqq. There are many similar samples of code in related questions, link, but it seems like most want to mutate, or replace with the mean, or replace with zero, when there is an N/A. Without being a burden and asking the same question, I thought it might be easier to show the explicit difference with my case, where I want to re-scale the remaining variables. I hadn't seen that elsewhere. And it might just be an obvious, short answer, by using na.rm. — milaske

markdly · Accepted Answer · 2017-10-01T22:58:39

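Using weighted.mean from the base stats package with the argument na.rm = TRUE should give you the result you need. With na.rm = TRUE, missing values are dropped together with their weights, and the weighted sum is divided by the sum of the remaining weights, which is equivalent to scaling those weights up proportionately. Here is a tidyverse way this could be done: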

library(tidyverse)
scores <- tribble(
 ~student, ~test1, ~test2, ~test3,
   "Mark",     90,     91,     92,
   "Mike",     NA,     79,     98,
   "Nick",     81,     NA,     83)

weights <- tribble(
  ~test,   ~weight, 
  "test1",     0.2, 
  "test2",     0.4,
  "test3",     0.4)

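scores %>%
  # reshape to long format: one row per student/test pair
  gather(test, score, -student) %>%
  # attach each test's weight
  left_join(weights, by = "test") %>%
  group_by(student) %>%
  # na.rm = TRUE drops missing scores along with their weights
  summarise(result = weighted.mean(score, weight, na.rm = TRUE))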
#> # A tibble: 3 x 2
#>   student   result
#>     <chr>    <dbl>
#> 1    Mark 91.20000
#> 2    Mike 88.50000
#> 3    Nick 82.33333
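
To see why this gives the rescaled result, here is a minimal base R sketch that does the arithmetic by hand and compares it with weighted.mean. The matrix layout and the helper name rescaled_mean are assumptions made for illustration; they are not part of the original answer.

```r
# Hypothetical data mirroring the example above: one row per student.
scores <- matrix(
  c(90, 91, 92,
    NA, 79, 98,
    81, NA, 83),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Mark", "Mike", "Nick"), c("test1", "test2", "test3"))
)
weights <- c(test1 = 0.2, test2 = 0.4, test3 = 0.4)

# Manual version (illustrative helper): drop the missing scores, then divide
# by the sum of the weights that remain. Dividing by that sum is the same as
# rescaling the remaining weights so they add up to 1.
rescaled_mean <- function(x, w) {
  keep <- !is.na(x)
  sum(x[keep] * w[keep]) / sum(w[keep])
}

# Apply both versions row by row (MARGIN = 1 means "per row").
manual  <- apply(scores, 1, rescaled_mean, w = weights)
builtin <- apply(scores, 1, weighted.mean, w = weights, na.rm = TRUE)

print(manual)
print(all.equal(manual, builtin))
```

For Mike, for example, the calculation is (0.4 * 79 + 0.4 * 98) / (0.4 + 0.4) = 88.5, which matches the tidyverse output above. Note that a row with no scores at all would give NaN in both versions, since the remaining weights sum to zero.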
