1
votes

Let's assume I have a data frame with lots of columns (var1, ..., var100) and a named vector of the same length whose names match the column names.
I would like to write a function that, wherever the data frame contains an NA, fills it with the value from the named vector for that column. This is what I have written so far:

data %>% 
  mutate(var1 = ifelse(is.na(var1), named_vec["var1"], var1),
         var2 = ifelse(is.na(var2), named_vec["var2"], var2),
         ...)

It works, but with hundreds of variables it would be very impractical to write out so many conditions. I then tried this:

data %>% 
   mutate_if(~ifelse(is.na(.x), named_vec[colnames(.x)], .x))

Error in selected[[i]] <- eval_tidy(.p(column, ...)) : 
  more elements supplied than there are to replace

However, this does not work. Is there a way in dplyr to extract the column name so I can index the named vector with it?

Here is a small example of data to try:

data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))

named_vec <- c("var1" = 1, "var2" = 2, "var3" = 3)
1
The output of your non-working code would also be helpful. - Pablo Herreros Cantis

1 Answer

2
votes

It may be easier to do this with coalesce

library(dplyr)
library(purrr)
library(stringr)
nm1 <- str_c('var', 1:3)
data[nm1] <- map_dfc(nm1, ~ coalesce(data[[.x]], named_vec[.x]))
data
#  var1 var2 var3
#1    1    2    3
#2    1    2    3
#3    1    2    3
#4    1    2    3

Or, if we replicate 'named_vec' so that it lines up with every cell of 'data' (col(data) gives the column index of each cell),

data[] <- coalesce(unlist(data), named_vec[col(data)])
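
To make the replication step more concrete, here is a minimal base R sketch of the same idea, with no dplyr involved. The names `idx`, `replicated` and `filled` are only illustrative, and the example assumes the order of 'named_vec' matches the column order of 'data'.

```r
# Example data from the question
data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))
named_vec <- c(var1 = 1, var2 = 2, var3 = 3)

# col() returns, for every cell, the index of the column it sits in
idx <- col(data)

# Indexing with that matrix yields one replacement value per cell,
# in column-major order (all of var1, then var2, then var3)
replicated <- named_vec[idx]

# Overwrite only the NA cells with the value for their column
filled <- data
filled[is.na(filled)] <- replicated[is.na(data)]
filled
```

Because the lookup here is by position, reordering the vector first with `named_vec[names(data)]` would be a sensible precaution when the orders may differ.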

Another option is to convert to 'long' format, then do a left_join, coalesce the 'value' columns, and reshape back to 'wide' format

library(tidyr)
library(tibble) # for enframe()
data %>%
   mutate(rn = row_number()) %>%
   pivot_longer(cols = -rn) %>% 
   left_join(enframe(named_vec), by = 'name') %>%
   transmute(rn, name, value = coalesce(value.x, value.y)) %>% 
   pivot_wider(names_from = name, values_from = value) %>% 
   select(-rn)
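
As for extracting the column name inside dplyr itself, one more possibility is `across()` together with `cur_column()`, which returns the name of the column currently being processed. This is only a sketch and assumes dplyr >= 1.0.0, where both functions are available; `filled` is just an illustrative name.

```r
library(dplyr)

# Example data from the question
data <- data.frame(var1 = c(1, 1, NA, 1),
                   var2 = c(2, NA, NA, 2),
                   var3 = c(3, 3, 3, NA))
named_vec <- c(var1 = 1, var2 = 2, var3 = 3)

filled <- data %>%
  # restrict to columns that have an entry in named_vec
  mutate(across(all_of(names(named_vec)),
                # cur_column() is the current column's name, so this
                # looks up the replacement value for that column
                ~ coalesce(.x, named_vec[[cur_column()]])))
filled
```

Here the lookup is by name rather than by position, so the order of 'named_vec' does not need to match the column order.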