I have a tibble dataframe as follows:
data = tibble(X = c("a", "b", "c", "d","c"),
Y = c("a1", "b1", "c1", "d1","c1"),
Z = c("a2", "b2", "c2", "d2","c2"),
all = list(c("a"), c("b"), c("c", "c1"), c("d", "d2"), c("c", "b2")))
I want to create a new column "result" such that for each row :
- if the "Y" value is in "all" => result equals the Y value
- if the "Z" value is in "all" => result equals the Z value
- else result equals "none"
I've tried the following code with the dplyr syntax.
data %>%
mutate(result = case_when(Y %in% all ~ Y,
Z %in% all ~ Z,
TRUE ~ "none"))
It doesn't work as expected and returns:
# A tibble: 4 x 5
X Y Z all result
<chr> <chr> <chr> <list> <chr>
1 a a1 a2 <chr [2]> none
2 b b1 b2 <chr [1]> none
3 c c1 c2 <chr [2]> none
4 d d1 d2 <chr [2]> none
When I'd like to obtain :
# A tibble: 4 x 5
X Y Z all result
<chr> <chr> <chr> <list> <chr>
1 a a1 a2 <chr [2]> none
2 b b1 b2 <chr [1]> none
3 c c1 c2 <chr [2]> c1
4 d d1 d2 <chr [2]> d2
EDIT
One problem is to unlist the values in the column list as stated by Ronak Shah. But even with this solution, the behavior is like working on the column list will consider all the values of the column and not only those of a row.
Here is what I obtain with the proposed solution and the edited data :
data %>%
mutate(result = case_when(Y %in% flatten_chr(all) ~ Y,
Z %in% flatten_chr(all) ~ Z,
TRUE ~ "none"))
# A tibble: 5 x 5
X Y Z all result
<chr> <chr> <chr> <list> <chr>
1 a a1 a2 <chr [1]> none
2 b b1 b2 <chr [1]> b2
3 c c1 c2 <chr [2]> c1
4 d d1 d2 <chr [2]> d2
5 c c1 c2 <chr [2]> c1
When it should return :
# A tibble: 5 x 5
X Y Z all result
<chr> <chr> <chr> <list> <chr>
1 a a1 a2 <chr [1]> none
2 b b1 b2 <chr [1]> none
3 c c1 c2 <chr [2]> c1
4 d d1 d2 <chr [2]> d2
5 c c1 c2 <chr [2]> none
data %>% rowwise() %>% mutate(result = case_when(Y %in% unlist(all) ~ Y, Z %in% unlist(all) ~ Z,TRUE ~ "none"))
Should do what you want – A. Suliman