There are no recent answers to this question using the current tidyverse verbs (R 4.1 & tidyverse 1.3.1 in my case). I've tried using mutate with both case_when() and ifelse() with select_if() to conditionally fill a new variable with a value calculated from the number of TRUE values in specific other columns by row but neither seem to be filtering the correct columns to calculate from, as intended. I could probably pivot longer to replace my column groupings and avoid the need to filter which columns are used in the mutate calculation but I want to keep one response per row for merging later. Here's a reproducible example.
library(tidyverse)
set.seed(195)
# create dataframe
response_id <- rep(1:461)
questions <- c("overall","drought","domestic","livestock","distance")
answers <- c("a","b","c","d","e")
colnames <- apply(expand.grid(questions, answers), 1, paste, collapse="_")
df <- tibble(response_id)
# data is actually an unknown mix of TRUE and FALSE values in all columns but just doing that for one column for now for simplicity
df[,colnames] = FALSE
df$overall_a[sample(nrow(df),100)] <- TRUE
# using ifelse and select if to filter which columns to sum
df %>%
mutate(positive = ifelse(select_if(isTRUE), sum(str_detect(colnames(df), "a|b")), NA)) %>%
mutate(negative = ifelse(select_if(isTRUE), sum(str_detect(colnames(df), "c|d|e")), NA)) %>%
select(response_id, positive, negative)
# using case_when
df %>%
mutate(positive = case_when(TRUE ~ sum(str_detect(colnames(df), "a|b"))), NA) %>%
mutate(negative = case_when(TRUE ~ sum(str_detect(colnames(df), "c|d|e"))), NA) %>%
select(response_id, positive, negative)
The desired output should be as follows. Thanks for any help on this!
# A tibble: 461 × 3
response_id positive negative
<int> <int> <int>
1 1 0 0
2 2 0 0
3 3 0 0
4 4 0 0
5 5 1 0
6 6 1 0
7 7 0 0
8 8 1 0
9 9 0 0
10 10 1 0
# … with 451 more rows