I'm trying to extend my own workflow from deleting a column within a nested column/list (see [1] tidyverse - delete a column within a nested column/list) to filtering rows within a nested column/list. I found this potential solution: [2] Use filter() (and other dplyr functions) inside nested data frames with map()
My problem is that, in each "nest", I want to drop the rows that are completely NA (i.e. I want to keep any row that has at least one non-missing value).
However, the simple solution in [2] doesn't work for me, probably because I want to filter on the number of NAs per row, which might involve another map function within the filter.
(Note: I'm using the current GitHub version of dplyr within tidyverse, which offers some new experimental functions such as condense, which I use below. I don't think that's relevant to my question, though.)
I have the following data:
library(tidyverse)
library(corrr)
dat <- data.frame(grp = rep(1:4, each = 25),
                  Q1 = sample(c(1:5, NA), 100, replace = TRUE),
                  Q2 = sample(c(1:5, NA), 100, replace = TRUE),
                  Q3 = sample(c(1:5, NA), 100, replace = TRUE),
                  Q4 = sample(c(1:5, NA), 100, replace = TRUE),
                  Q5 = sample(c(NA), 100, replace = TRUE),  # entirely NA
                  Q6 = sample(c(1:5, NA), 100, replace = TRUE))
I now calculate the correlations of Q1 to Q6 per group and delete the rowname column:
cor_dat <- dat %>%
  group_by(grp) %>%
  condense(cor = correlate(cur_data()) %>%
             select(-rowname)) %>%
  ungroup()
But adding this line to my pipeline doesn't work:
cor_dat <- cor_dat %>%
  mutate(cor = map(cor, ~ filter(., sum(is.na(.)) != ncol(.))))
I also tried the following, but it doesn't work either:
cor_dat <- cor_dat %>%
  mutate(cor = map(cor, ~ filter(., !all(is.na(.)))))
The expected outcome in my data is that the fifth row in each nest is filtered out, because Q5 is entirely NA.
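To make the condition I'm after concrete, here is a minimal sketch on a single toy data frame, outside of any nesting. The names `toy` and `keep` and the values are made up for illustration, and it uses base R's `rowSums()` rather than anything from [2], so this only shows the row-wise test I have in mind, not a solution for the nested case:

```r
# Toy stand-in for one nested correlation table (values are made up).
toy <- data.frame(Q1 = c(NA, 0.2, NA),
                  Q2 = c(0.2, NA, NA),
                  Q5 = c(NA, NA, NA))

# TRUE for every row that has at least one non-missing value.
keep <- rowSums(!is.na(toy)) > 0

# Row 3 is entirely NA, so only rows 1 and 2 should remain.
toy[keep, ]
```

What I can't figure out is how to express this per-row test inside filter() when it is applied to each nest via map().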