1
votes

I have a data frame with ports and the number of voyages (n) for each:

library(dplyr)

ports <- c("Nantes", "Bordeaux", "Liverpool", "Bayonne", "Brest", "Bristol")
n <- c(47, 78, 45, 1, 1, 2)

ports_n <- data.frame(ports, n)

Here is my output:

      ports  n
1    Nantes 47
2  Bordeaux 78
3 Liverpool 45
4   Bayonne  1
5     Brest  1
6   Bristol  2

What I want: use the dplyr package to combine all ports with n <= 2 into a single row called "Others" whose n is the sum of their values.

Expected output:

      ports  n
1    Nantes 47
2  Bordeaux 78
3 Liverpool 45
4    Others  4

What I tried:

top_ports <- ports_n %>%
  filter(n > 2)   # ports to keep as they are

minor_ports <- ports_n %>%
  filter(n <= 2)  # ports to merge into "Others"

2 Answers

3
votes

You could replace the value in ports with 'others' wherever n <= 2, then group by ports and sum n.

library(dplyr)

ports_n %>%
  mutate(ports = replace(ports, n <= 2, 'others')) %>%
  group_by(ports) %>%
  summarise(n = sum(n))

# A tibble: 4 x 2
#  ports         n
#  <chr>     <dbl>
#1 Bordeaux     78
#2 Liverpool    45
#3 Nantes       47
#4 others        4
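To make the two steps explicit, here is a minimal base R sketch of what the pipeline above does to the underlying vectors. The names relabelled and totals are only illustrative, and split() plus vapply() stand in for group_by() and summarise().

```r
ports <- c("Nantes", "Bordeaux", "Liverpool", "Bayonne", "Brest", "Bristol")
n <- c(47, 78, 45, 1, 1, 2)

# Step 1: overwrite the port name wherever n <= 2
relabelled <- replace(ports, n <= 2, "others")
relabelled
# [1] "Nantes"    "Bordeaux"  "Liverpool" "others"    "others"    "others"

# Step 2: sum n within each label (the role of group_by() + summarise())
totals <- vapply(split(n, relabelled), sum, numeric(1))
totals[["others"]]
# [1] 4
```

The three small ports (1 + 1 + 2) collapse into one label, so their sum is 4, while the other ports keep their original values.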

Or, using the same logic in base R:

aggregate(n~ports, transform(ports_n, 
         ports = replace(ports, n <= 2, 'others')), sum)

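data

Read ports as character strings rather than factors. Before R 4.0.0, data.frame() converted strings to factors by default, and replace() cannot assign 'others' to a factor that has no such level.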

ports_n <- data.frame(ports, n, stringsAsFactors = FALSE)
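A small sketch of why the character conversion matters, using a shortened, made-up version of the data (ports_f, from_factor and from_character are illustrative names only). The warning is suppressed here just to keep the example quiet.

```r
ports_f <- factor(c("Nantes", "Bayonne", "Brest"))
n <- c(47, 1, 1)

# On a factor, "others" is not an existing level, so R warns and stores NA
from_factor <- suppressWarnings(replace(ports_f, n <= 2, "others"))
anyNA(from_factor)
# [1] TRUE

# Converting to character first lets the new label through
from_character <- replace(as.character(ports_f), n <= 2, "others")
from_character
# [1] "Nantes" "others" "others"
```

If the column is already a factor, calling as.character() on it before replace() should give the same result as reading the data with stringsAsFactors = FALSE.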
0
votes

Another dplyr option could be:

ports_n %>%
 filter(n > 2) %>%
 add_row(ports = "Others", n = sum(ports_n$n[ports_n$n <= 2]))

      ports  n
1    Nantes 47
2  Bordeaux 78
3 Liverpool 45
4    Others  4
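If the threshold or the label needs to vary, the same filter-and-append idea could be wrapped in a function. This is only a sketch: lump_small, threshold and label are hypothetical names, and it assumes the columns are called ports and n and that ports is a character column.

```r
library(dplyr)

ports <- c("Nantes", "Bordeaux", "Liverpool", "Bayonne", "Brest", "Bristol")
n <- c(47, 78, 45, 1, 1, 2)
ports_n <- data.frame(ports, n, stringsAsFactors = FALSE)

# Hypothetical helper: keep rows with n above the threshold and
# append one row holding the sum of all the others.
lump_small <- function(df, threshold = 2, label = "Others") {
  small <- df$n <= threshold
  df %>%
    filter(!small) %>%
    add_row(ports = label, n = sum(df$n[small]))
}

result <- lump_small(ports_n)
result
```

Note that this version always appends the label row, so if no port falls at or below the threshold it adds a row with n = 0.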