I've run into a stumbling block with dplyr::mutate a few times: I can't figure out how to create new columns by applying a function (e.g., a sum or anything else) to every pair of columns drawn from two input sets of columns. A partial demonstration is below:
#Input data
library(dplyr)  # tibble(), mutate(), select(), bind_cols(), if_else(), %>%
library(purrr)  # map2_dfc(), reduce2()

set.seed(100)
in_dat <- tibble(x1 = sample(x = c(1:10, NA_real_), size = 1000, replace = TRUE),
                 x2 = sample(x = c(1:10, NA_real_), size = 1000, replace = TRUE),
                 x3 = sample(x = c(1:10, NA_real_), size = 1000, replace = TRUE),
                 x4 = sample(x = c(1:10, NA_real_), size = 1000, replace = TRUE),
                 y1 = sample(x = c(1, 0, NA_real_), size = 1000, replace = TRUE),
                 y2 = sample(x = c(1, 0, NA_real_), size = 1000, replace = TRUE),
                 y3 = sample(x = c(1, 0, NA_real_), size = 1000, replace = TRUE),
                 y4 = sample(x = c(1, 0, NA_real_), size = 1000, replace = TRUE),
                 y5 = sample(x = c(1, 0, NA_real_), size = 1000, replace = TRUE),
                 y6 = sample(x = c(1, 0, NA_real_), size = 1000, replace = TRUE))
#Output data with 1 column pair; all pairs between x and y should be computed
out_dat_1col <- in_dat %>%
  mutate(miss_x1y1 = if_else(is.na(x1) & is.na(y1), TRUE, FALSE))
This checks whether a pair of x and y columns both have missing values and marks TRUE in the new column. This is only one pair, though, and I'd like a way to do this for all pairs between the x and y columns without manually coding each of them in its own mutate line. I think purrr should be able to accomplish this, but I haven't figured out the proper syntax with the map variants, or possibly reduce. I'm currently getting an error from both map2_dfc (to append the new columns to the existing columns with bind_cols) and reduce2 that .x (x variables) and .y (y variables) are not of consistent length, and I'm not sure how to circumvent this. Any thoughts are much appreciated.
#Produces error
out_dat <- in_dat %>%
  bind_cols(map2_dfc(
    .x = in_dat %>% select(starts_with('x')),
    .y = in_dat %>% select(starts_with('y')),
    .f = ~if_else(is.na(.x) & is.na(.y), TRUE, FALSE)
  ))
Error: Mapped vectors must have consistent lengths:
* `.x` has length 4
* `.y` has length 6
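
The error follows from how map2 works: it walks .x and .y in parallel, so the 4 x columns and 6 y columns cannot be matched up position by position. A minimal sketch of one possible way around this, assuming tidyr is available, is to build the cross product of column names first and then map over the paired names. The toy tibble and the names pairs, miss_cols, and out_toy are made up for illustration and are not part of the original data.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Small stand-in for in_dat (hypothetical values, for illustration only)
toy <- tibble(
  x1 = c(NA, 1, NA),
  x2 = c(2, NA, NA),
  y1 = c(NA, 0, 1),
  y2 = c(1, NA, NA)
)

# Every (x, y) column-name pair: expand_grid() returns the cross product,
# so both name vectors end up with the same length (here 2 * 2 = 4)
pairs <- expand_grid(
  x = names(select(toy, starts_with("x"))),
  y = names(select(toy, starts_with("y")))
)

# One logical column per pair, TRUE where both columns are missing;
# .x and .y are column names here, not the columns themselves
miss_cols <- map2(pairs$x, pairs$y, ~ is.na(toy[[.x]]) & is.na(toy[[.y]])) %>%
  set_names(paste0("miss_", pairs$x, pairs$y)) %>%
  as_tibble()

# Append the new columns to the original data
out_toy <- bind_cols(toy, miss_cols)
```

With in_dat in place of toy, the same steps should yield 4 * 6 = 24 new columns named miss_x1y1 through miss_x4y6, though I have only reasoned this through on the small example.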