I use the tidyverse in RStudio and have a data frame (df) with multiple variables and observations from patients.
Seven of its columns are character variables, one per symptom. These columns also contain NAs, and some observations have more than one symptom recorded.
Here are the first 10 rows and 4 of those columns:
        symptom_1      symptom_2      symptom_3      symptom_4
1            <NA>           <NA> SYMPTOM'S NAME SYMPTOM'S NAME
2            <NA> SYMPTOM'S NAME           <NA> SYMPTOM'S NAME
3            <NA>           <NA>           <NA>           <NA>
4            <NA>           <NA>           <NA>           <NA>
5            <NA>           <NA>           <NA>           <NA>
6            <NA>           <NA>           <NA>           <NA>
7            <NA>           <NA>           <NA>           <NA>
8            <NA>           <NA>           <NA>           <NA>
9            <NA>           <NA>           <NA>           <NA>
10           <NA>           <NA>           <NA>           <NA>
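For reproducibility, the rows above can be rebuilt roughly as follows. This is only a sketch: the name df_example and the placeholder string are stand-ins (the real columns hold the actual symptom names), and symptom_5 to symptom_7 are left out because they are not shown.

```r
# Placeholder text; the real data holds the actual symptom names.
s <- "SYMPTOM'S NAME"

# Rows 1-10 of symptom_1 to symptom_4 as printed above.
# df_example is a hypothetical name used only for this illustration.
df_example <- data.frame(
  symptom_1 = rep(NA_character_, 10),
  symptom_2 = c(NA, s, rep(NA, 8)),
  symptom_3 = c(s, rep(NA, 9)),
  symptom_4 = c(s, s, rep(NA, 8)),
  stringsAsFactors = FALSE
)

print(df_example)
```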
I would like to build a new factor column that contains "Positive" for observations with at least one symptom recorded and NA for observations where all symptom columns are NA. That is, the column should contain "Positive" for cases 1 and 2 and NA for cases 3 to 10. I have searched this site for a solution and tried several suggestions; the closest I got to what I want is the following:
df <-
  df %>%
  select(symptom_1:symptom_7) %>%
  mutate_if(is.character, funs(any_positive = ifelse(!is.na(.), "Positive", .)))
However, this code produced 14 more columns, named "symptom_1_any_positive", "symptom_2_any_positive", "symptom_3_any_positive" and so on, rather than a single column. How can I solve this and combine the variables into just one column?
Thank you in advance.
grepl and names won't work in this situation. – Jakhongir Alidjanov