I have searched around but could not find a particular answer to my question.
Suppose I have a data frame df:
df = data.frame(id = c(10, 11, 12, 13, 14),
V1 = c('blue', 'blue', 'blue', NA, NA),
V2 = c('blue', 'yellow', NA, 'yellow', 'green'),
V3 = c('yellow', NA, NA, NA, 'blue'))
I want to use the values of V1-V3 as unique column headers and I want the occurrence frequency of each of those per row to populate the rows.
Desired output:
desired = data.frame(id = c(10, 11, 12, 13, 14),
blue = c(2, 1, 1, 0, 1),
yellow = c(1, 1, 0, 1, 0),
green = c(0, 0, 0, 0, 1))
There is probably a really cool way to do this with tidyr::spread and dplyr::summarise. However, I don't know how to spread the V* columns when the keys I want to spread by are all over the place in different columns and include NAs.
Thanks for any help!