Example data:
library(dplyr)  # also provides tibble() and the %>% pipe
tibbly = tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
                grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
                grouping2 = c("X","X","X","Y","Y","Y","X","X","X","Y","Y","Y"),
                value = c(1,2,3,4,4,6,2,5,3,6,3,2))
> tibbly
# A tibble: 12 x 4
age grouping1 grouping2 value
<dbl> <chr> <chr> <dbl>
1 10 A X 1
2 30 A X 2
3 50 A X 3
4 10 A Y 4
5 30 A Y 4
6 50 A Y 6
7 10 B X 2
8 30 B X 5
9 50 B X 3
10 10 B Y 6
11 30 B Y 3
12 50 B Y 2
Question: How can I obtain the order of rows for each group in a data frame? I can use dplyr to arrange the data in an appropriate form to visualize what I am interested in:
> tibbly %>%
    group_by(grouping1, grouping2) %>%
    arrange(grouping1, grouping2, desc(value))
# A tibble: 12 x 4
# Groups: grouping1, grouping2 [4]
age grouping1 grouping2 value
<dbl> <chr> <chr> <dbl>
1 50 A X 3
2 30 A X 2
3 10 A X 1
4 50 A Y 6
5 10 A Y 4
6 30 A Y 4
7 30 B X 5
8 50 B X 3
9 10 B X 2
10 10 B Y 6
11 30 B Y 3
12 50 B Y 2
In the end I am interested in the order of the age column for each group, based on the value column. Is there an elegant way to do this with dplyr? Something like summarise(), but based on the order of rows and not the actual values.
row_number? As in
tibbly %>% group_by(grouping1, grouping2) %>% arrange(grouping1, grouping2, desc(value)) %>% mutate(RowNum=row_number())
– Kerry Jackson
tibbly %>% group_by(grouping1, grouping2) %>% arrange(grouping1, grouping2, desc(value)) %>% summarise(order = paste0(age, collapse = ",")) %>% ungroup()
– AntoniosK
arrange(grouping1, grouping2, desc(value)) can be used for visualisation purposes, but arrange(desc(value)) after grouping is enough to do the job you want. – AntoniosK
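Putting the suggestions from the comments together, here is a minimal sketch of two ways to get the per-group order. The names age_order and ranked are made up for illustration, and the .groups argument assumes dplyr 1.0.0 or later. Instead of sorting with arrange() first, this variant indexes age by order(desc(value)) inside each group, which should give the same result as the arrange-then-summarise approach above.

```r
library(dplyr)

# Same data as in the question, built with rep() for brevity.
tibbly <- tibble(
  age       = rep(c(10, 30, 50), times = 4),
  grouping1 = rep(c("A", "B"), each = 6),
  grouping2 = rep(rep(c("X", "Y"), each = 3), times = 2),
  value     = c(1, 2, 3, 4, 4, 6, 2, 5, 3, 6, 3, 2)
)

# One row per group: ages listed from highest to lowest value.
# order() keeps tied values in their original row order.
age_order <- tibbly %>%
  group_by(grouping1, grouping2) %>%
  summarise(order = paste(age[order(desc(value))], collapse = ","),
            .groups = "drop")

# Alternative: keep all rows and add each row's rank within its group.
# row_number(x) ranks like rank(x, ties.method = "first").
ranked <- tibbly %>%
  group_by(grouping1, grouping2) %>%
  mutate(rank = row_number(desc(value))) %>%
  ungroup()

print(age_order)
print(ranked)
```

The first form is convenient when one string per group is all that is needed; the second keeps the original rows, so the rank can be used for further filtering or plotting.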