R dplyr mutate columns relative to other columns

Question

I have a dataframe:

df <- data.frame(x = 1:5, y = rep(1,5), z = 0:4, 
                 fx = NA_real_, fy = NA_real_, fz = NA_real_)
my_count_columns <- c("x", "y", "z")

I want to fill in the columns fx, fy, and fz in place so that each one holds the relative frequency of its count variable, i.e. each value divided by its column total.

What is the cleanest way to do this in dplyr/tidyverse, assuming I don't know the column names ahead of time?

Expected output:

  x y z         fx  fy  fz
1 1 1 0 0.06666667 0.2 0.0
2 2 1 1 0.13333333 0.2 0.1
3 3 1 2 0.20000000 0.2 0.2
4 4 1 3 0.26666667 0.2 0.3
5 5 1 4 0.33333333 0.2 0.4

akrun · Accepted Answer · 2021-05-06T22:28:01

In base R, this could be

df[paste0('f', my_count_columns)] <- lapply(my_count_columns, 
   function(x) sapply(df[[x]], function(y) 
       mean(y == df[setdiff(my_count_columns, x)])))

Or in tidyverse

library(dplyr)
library(purrr)
df %>%
    select(all_of(my_count_columns)) %>% 
    mutate(across(everything(), ~  map_dbl(., function(x)
      mean(x == df[setdiff(my_count_columns, cur_column())])), 
          .names = 'f{.col}'))
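
The expected output in the question corresponds to each value divided by its column total (for example, fx in row 1 is 1/15). The following is a minimal sketch under that reading, not a definitive implementation. It uses only `across()` with `all_of()` and `.names`, so the column names do not need to be known ahead of time; the names `result` and `result_base` are illustrative.

```r
library(dplyr)

df <- data.frame(x = 1:5, y = rep(1, 5), z = 0:4,
                 fx = NA_real_, fy = NA_real_, fz = NA_real_)
my_count_columns <- c("x", "y", "z")

# Assumption: "frequency" means value / column total, as the expected
# output in the question suggests.
# .names = "f{.col}" yields fx, fy, fz, which already exist in df,
# so mutate() overwrites them in place and keeps the column order.
result <- df %>%
  mutate(across(all_of(my_count_columns),
                ~ .x / sum(.x),
                .names = "f{.col}"))

# Base R equivalent under the same assumption
result_base <- df
result_base[paste0("f", my_count_columns)] <-
  lapply(df[my_count_columns], function(col) col / sum(col))

print(result)
```

If a count column sums to zero, the division returns NaN, so that case would need separate handling.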
