I have following problem. I have list column of tibbles, which I want to clean up by just taking certain values and creating new tibble from them. Problem is that not all columns are available in original tibble, so I would like to replace them with NA. Example will clear this one.
DATA:
tbl <- tibble(list_col = list(tibble(id = c("A", "B", "C"),
val1 = c(1, 2, 3),
val2 = c(11, 22, 33)),
tibble(id = c("A", "B", "C"),
val1 = c(1, 2, 3))))
Real data is more complicated, but this will suffice to show problem. I have created function to pick columns I would like to have. I wrap everything now days with possibly
to avoid falling short.
get_res <- possibly(function(ls){
out = tibble(id = ls$id,
val1 = ls$val1,
val2 = ls$val2)
return(out)
}, otherwise = NA)
Then I usually map
function with mutate
to create clean column.
tbl %>%
mutate(clean = map(list_col, get_res))
# A tibble: 2 x 2
list_col clean
<list> <list>
1 <tibble [3 x 3]> <tibble [3 x 3]>
2 <tibble [3 x 2]> <lgl [1]>
Warning message:
Unknown or uninitialised column: 'val2'.
Expected result, but how can I go around this and replace val2 with NA in second tibble and return values what I have there?
I thought this would have worked:
get_res2 <- possibly(function(ls){
out = tibble(id = ls$id,
val1 = ls$val1,
val2 = tryCatch(ls$val2, error=function(e) NA))
return(out)
}, otherwise = NA)
But this will fail as well.
id
,val1
,val2
, can we assume that your actual case also has only few columns with known names? So that we don't need to look for all possible column names intbl
first. – Julius Vainora