
I'm trying to use dplyr to count, for each city, the observations that meet at least one of several criteria. For example:

datacount.by.city <- data %>% 
group_by(city) %>% 
filter(cond1 == TRUE | cond2 == TRUE) %>% 
tally()

I'm appending this count to an existing dataframe that contains more cities than these data do. Is there a way to group_by(city) in this code while adding NA values for cities that are in my main dataframe but not in the data I'm working on, so that I can easily cbind to it and have the right number of rows in the right order?

Please make this more reproducible by including example input data and the expected output. - neilfws
dplyr's bind_rows() and bind_cols() don't need your data to be ordered. - Elio Diaz
City  Cond1 Cond2
City1 TRUE  FALSE
City2 TRUE  FALSE
City2 FALSE TRUE
City1 FALSE TRUE

Turns to

City  count
City1 2
City2 2
City3 0

- Conor
Possible duplicate of left_join(x,y) and NA - tjebo
Please have a thorough look at left_join. I personally really like the rstudio.com/wp-content/uploads/2015/02/… - tjebo

1 Answer


Suppose your full dataset is df and data is a subset of it, for instance:

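library(plyr)   # provides ddply() and .(); load before dplyr to limit masking
library(dplyr)  # provides %>% and right_join()

# data holds only a subset of the cities in df
data <- df %>%
    subset(city == "A")

# count matching rows per city, then keep every row of df (unmatched cities get NA)
datacount.by.city <- data %>%
    ddply(.(city), summarize, count = sum(cond1 == TRUE | cond2 == TRUE)) %>%
    right_join(df, by = "city")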

gives:

     city count cond1 cond2
1    A     1  TRUE  TRUE
2    B    NA  TRUE  TRUE
3    C    NA  TRUE  TRUE
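
Note that joining against the whole of df also brings along its other columns and returns one row per row of df. If you only want one row per city, a dplyr-only variant might look like the sketch below. It reuses the example data from the comments; the name main for the larger dataframe is made up for illustration, and replacing NA with 0 is optional (drop the mutate() line to keep NA).

```r
library(dplyr)

# Hypothetical main dataframe: it includes City3, which has no observations.
main <- data.frame(city = c("City1", "City2", "City3"))

# Example observations taken from the comment thread.
data <- data.frame(
  city  = c("City1", "City2", "City2", "City1"),
  cond1 = c(TRUE, TRUE, FALSE, FALSE),
  cond2 = c(FALSE, FALSE, TRUE, TRUE)
)

# Count the rows per city where at least one condition holds.
counts <- data %>%
  filter(cond1 | cond2) %>%
  count(city, name = "count")

# Keep every city in main, in main's order; unmatched cities get NA,
# which is then replaced by 0.
datacount.by.city <- main %>%
  distinct(city) %>%
  left_join(counts, by = "city") %>%
  mutate(count = coalesce(count, 0L))

print(datacount.by.city)
```

Because left_join() keeps the rows of its left-hand table in their original order, the result lines up with main and can be bound to it column-wise, provided main has one row per city.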