How to avoid arranging by the group_by column in dplyr

Question

I have the following data frame in R:

     Key       Quantity
     1_2013    20
     1_2013    20
     2_2013    20
     2_2013    30
     3_2013    20
     3_2013    20
     4_2013    20
     4_2013    30 
     10_2013   20
     10_2013   20
     11_2013   20
     11_2013   30

When I aggregate on the Key column, I want to keep the original order of the Key column. But when I use group_by in dplyr, it gives me the following order:

     Key       Quantity
     1_2013    40
     10_2013   40
     11_2013   50
     2_2013    50
     3_2013    40
     4_2013    50

I want it in the following order:

     Key       Quantity
     1_2013    40
     2_2013    50
     3_2013    40
     4_2013    50
     10_2013   40
     11_2013   50

How can I do it in dplyr?

There are already nice solutions posted, but maybe using year_month instead of month_year would workaround the problem altogether. Y_m format, like Y_m_d, makes things easier also when naming/browsing files as they usually appear alphabetically sorted as well. — zeehio
Could you clarify what the actual intended output is, as it will affect what the "correct" approach might be to your problem. Do you want the output to be ordered in the same manner as your input (regardless of temporal ordering), or do you want the output to be ordered by month and year? — Benjamin

Uwe Uwe · Accepted Answer · 2018-07-26T11:00:09

The OP has requested: "When I aggregate on Key column I want to keep the original order of Key column."

forcats::fct_inorder()

The forcats package, which is part of the tidyverse, provides fct_inorder(), which creates a factor whose levels are numbered in order of first appearance:

library(tidyverse)
read_table(
"    Key       Quantity
     1_2013    20
     1_2013    20
     2_2013    20
     2_2013    30
     3_2013    20
     3_2013    20
     4_2013    20
     4_2013    30 
     10_2013   20
     10_2013   20
     11_2013   20
     11_2013   30"
) %>% 
  group_by(Key = fct_inorder(Key)) %>% 
  summarise(Quantity = sum(Quantity))

# A tibble: 6 x 2
  Key     Quantity
  <fct>      <int>
1 1_2013        40
2 2_2013        50
3 3_2013        40
4 4_2013        50
5 10_2013       40
6 11_2013       50
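
If forcats is not available, the same idea can be sketched with base R's factor(), since unique() returns values in order of first appearance. This is a minimal illustration, not part of the original answer; the names df and result are placeholders, and the data frame is simply the question's data typed in by hand.

```r
library(dplyr)

# Same data as in the question, built by hand
df <- data.frame(
  Key = c("1_2013", "1_2013", "2_2013", "2_2013", "3_2013", "3_2013",
          "4_2013", "4_2013", "10_2013", "10_2013", "11_2013", "11_2013"),
  Quantity = c(20, 20, 20, 30, 20, 20, 20, 30, 20, 20, 20, 30),
  stringsAsFactors = FALSE
)

# unique() keeps values in order of first appearance, so the
# factor levels (and therefore the groups) follow the row order
result <- df %>%
  group_by(Key = factor(Key, levels = unique(Key))) %>%
  summarise(Quantity = sum(Quantity))

result
```

If a character column is needed afterwards, as.character(result$Key) converts the factor back without changing the row order.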

`data.table`

For the sake of completeness: although the OP has clearly asked for a dplyr solution, I just want to mention that grouping with by = in data.table returns the groups in order of appearance by default, so no factors are needed.

library(data.table)
fread(
  "    Key       Quantity
     1_2013    20
     1_2013    20
     2_2013    20
     2_2013    30
     3_2013    20
     3_2013    20
     4_2013    20
     4_2013    30 
     10_2013   20
     10_2013   20
     11_2013   20
     11_2013   30"
)[, .(Quantity = sum(Quantity)), by = Key]

       Key Quantity
1:  1_2013       40
2:  2_2013       50
3:  3_2013       40
4:  4_2013       50
5: 10_2013       40
6: 11_2013       50
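
As a small aside, the order-preserving behaviour applies to by = specifically. The sketch below, using a shortened copy of the question's data and the placeholder names dt, by_result, and keyby_result, contrasts it with keyby =, which sorts the result by the grouping column and sets it as the key.

```r
library(data.table)

# Shortened version of the question's data, typed in by hand
dt <- data.table(
  Key = c("1_2013", "1_2013", "2_2013", "2_2013", "10_2013", "10_2013"),
  Quantity = c(20, 20, 20, 30, 20, 20)
)

# by = returns the groups in order of first appearance
by_result <- dt[, .(Quantity = sum(Quantity)), by = Key]

# keyby = sorts the result by Key and marks Key as the table's key,
# so the original order of appearance is not preserved
keyby_result <- dt[, .(Quantity = sum(Quantity)), keyby = Key]

by_result
keyby_result
```

So for the OP's requirement, by = is the variant to use; keyby = would reintroduce a sorted order.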
