1
votes

I have a data frame with three columns: State1, State2, State3. Is there a way to get the counts of each state in one dataframe, using all three columns (preferably with dplyr and without an explicit loop)? I only figured out how to do one column:

df %>% group_by(State1) %>% summarise(n=sum(!is.na(State1)))
1
Is it all grouped by 'State1' or individual groupings. If it is the latter, then you may need a loop. Please provide a small reproducible example and expected ooutput - akrun
My attempt above is grouped by State1, but I wouldn't say the solution necessarily has to be grouped like that. The factors are the same across all three columns though (50 states + territories). - mp3242
Something like gather(df) %>% group_by(key, val) %>% summarise(n = sum(!is.na(val))) ?\ - akrun
Yup if you expliclty want the non na sum, you should follow @akruns advice. My approach should just add an extra row that counts the NA entries which may be helpful - Ajjit Narayanan
@AjjitNarayanan I think it is better to mention in your post because the question is confusing based on the code - akrun

1 Answers

1
votes

You're close. You should gather all your columns into one column first, then group_by and summarize.

df %>%
    gather("key", "value", state1, state2, state3) %>% 
    group_by(value) %>%
    summarise(n=n())

Note: This also counts the number of NA entries if you have any.