I'm struggling with multiple response questions in R, and I'm hoping to find an easy way to tackle them with dplyr and tidyr. Below is a sample multiple response data frame. I'm trying to do two things. First, I want to create percentages (% of cats, % of dogs, etc.), calculated as a share of overall responses. My usual way of calculating percentages -
group_by(_)%>%summarise(count=n())%>%mutate(percent=count/sum(count))
doesn't seem to cut it in this situation. Maybe I have to use summarise_each or a more specialized function? I'm still new to R and really new to dplyr and tidyr. I also tried tidyr's "unite" function, which works, but it includes the NAs in the combined string, which I will have to recode away. Even then, I can't seem to calculate the percentages of the united column.
Any suggestions would be great! So, two questions: how do I use "unite" to combine the multiple response columns into all possible combinations and then calculate the percentage of each combination, and how do I simply calculate the percentage of each binary column as a proportion of overall responses? Hope this makes sense! I'm sure there's a simple and elegant answer that I'm overlooking.
library(dplyr)
library(tidyr)
Cats<-c("Cat",NA,"Cat",NA,NA,NA,"Cat",NA)
Dogs<-c(NA,NA,"Dog","Dog",NA,"Dog",NA,"Dog")
Fish<-c(NA,NA,"Fish",NA,NA,NA,"Fish","Fish")
Pets<-data.frame(Cats,Dogs,Fish)
Pets<-Pets%>%unite(Combined,Cats,Dogs,Fish,sep=",",remove=FALSE)
Pets%>%group_by(Combined)%>%summarise(count=n())%>%mutate(percent=count/sum(count))
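To make the per-column target concrete, here is a rough base R sketch of the numbers I would expect. It is only an illustration, not the dplyr/tidyr approach I'm asking for. The names pct_of_rows and pct_of_answers are ones I made up for this example, and since "overall responses" could mean either the 8 rows or the 10 non-NA answers, both readings are shown.

```r
# Same sample data as above, with the labels quoted as strings
Cats <- c("Cat", NA, "Cat", NA, NA, NA, "Cat", NA)
Dogs <- c(NA, NA, "Dog", "Dog", NA, "Dog", NA, "Dog")
Fish <- c(NA, NA, "Fish", NA, NA, NA, "Fish", "Fish")
Pets <- data.frame(Cats, Dogs, Fish)

# Reading 1 (assumption): percent of rows with a non-NA answer in each column
pct_of_rows <- colMeans(!is.na(Pets)) * 100

# Reading 2 (assumption): each column's share of all non-NA answers
pct_of_answers <- colSums(!is.na(Pets)) / sum(!is.na(Pets)) * 100

print(pct_of_rows)     # Cats 37.5, Dogs 50, Fish 37.5
print(pct_of_answers)  # Cats 30, Dogs 40, Fish 30
```

Under the first reading the percentages do not need to sum to 100, because one row can tick several animals. Under the second they do.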