How do I count the number of mutated rows in dplyr?

Let's say I run a mutate operation on a column and it changes the values in only some of the rows. In particular, I used mutate(df, column = str_replace_all(column, "a", "A")), with str_replace_all from the stringr package, to conditionally modify some rows in a tibble.

Here's an example:

library(dplyr)
library(stringr)
library(ggplot2)
diamonds %>%
  mutate(cut = str_replace_all(cut, "a", "A"))

How could I extract the count/number of rows that have been mutated?

1 Answer

You could compare the mutated column with the original one. For example, add the following to the end of your chain and you'll get the number of rows whose value changed:

... %>% { sum(.$cut != diamonds$cut) }
# [1] 23161

The full code would be:

diamonds %>%
    mutate(cut = str_replace_all(cut, "a", "A")) %>%
    { sum(.$cut != diamonds$cut) }
# [1] 23161
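
One caveat with the comparison above: if the column contains NA, the != comparison yields NA and sum() returns NA as well. Here is a minimal sketch of an NA-tolerant variant, using a small made-up tibble (df, cut_new, n_changed and n_matched are hypothetical names for illustration, not part of the original question). It keeps the new values in a separate column so the comparison does not need to reach back to the original data frame:

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for the diamonds example
df <- tibble(cut = c("Fair", "Good", "Ideal", NA, "Premium"))

# Store the result in a new column so old and new values sit side by side
mutated <- df %>%
  mutate(cut_new = str_replace_all(cut, "a", "A"))

# Count rows whose value actually changed; na.rm = TRUE drops NA comparisons
n_changed <- sum(mutated$cut_new != mutated$cut, na.rm = TRUE)
n_changed

# For this particular replacement, counting the rows that match the
# pattern gives the same number without mutating first
n_matched <- sum(str_detect(df$cut, "a"), na.rm = TRUE)
n_matched
```

The str_detect() shortcut only agrees with the comparison when every match really changes the string, which holds here because "a" is replaced by a different character.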