
I have a tibble, df. I would like to group it and then use dplyr::pull to create one vector per group from the grouped data frame. I have provided a reprex below.

df is the base tibble, and my desired output is shown in df2. I just don't know how to get there programmatically. I tried pull, but it did not seem to respect group_by and instead returned the whole column as a single vector. Is what I'm trying to achieve possible with dplyr or base R? Note: new_col is supposed to be a vector created from the name column.

library(tidyverse)
library(reprex)

df <- tibble(group = c(1,1,1,1,2,2,2,3,3,3,3,3),
             name = c('Jim','Deb','Bill','Ann','Joe','Jon','Jane','Jake','Sam','Gus','Trixy','Don'),
             type = c(1,2,3,4,3,2,1,2,3,1,4,5))

df
#> # A tibble: 12 x 3
#>    group name   type
#>    <dbl> <chr> <dbl>
#>  1     1 Jim       1
#>  2     1 Deb       2
#>  3     1 Bill      3
#>  4     1 Ann       4
#>  5     2 Joe       3
#>  6     2 Jon       2
#>  7     2 Jane      1
#>  8     3 Jake      2
#>  9     3 Sam       3
#> 10     3 Gus       1
#> 11     3 Trixy     4
#> 12     3 Don       5

# Desired Output - New Col is a column of vectors

df2 <- tibble(group=c(1,2,3),name=c("Jim","Jane","Gus"), type=c(1,1,1), new_col = c("'Jim','Deb','Bill','Ann'","'Joe','Jon','Jane'","'Jake','Sam','Gus','Trixy','Don'"))
df2
#> # A tibble: 3 x 4
#>   group name   type new_col                         
#>   <dbl> <chr> <dbl> <chr>                           
#> 1     1 Jim       1 'Jim','Deb','Bill','Ann'        
#> 2     2 Jane      1 'Joe','Jon','Jane'              
#> 3     3 Gus       1 'Jake','Sam','Gus','Trixy','Don'
Created on 2020-11-14 by the reprex package (v0.3.0)

1 Answer


Maybe this is what you are looking for:

library(dplyr)

df <- tibble(group = c(1,1,1,1,2,2,2,3,3,3,3,3),
             name = c('Jim','Deb','Bill','Ann','Joe','Jon','Jane','Jake','Sam','Gus','Trixy','Don'),
             type = c(1,2,3,4,3,2,1,2,3,1,4,5))

df %>% 
  group_by(group) %>%
  mutate(new_col = name, name = first(name, order_by = type), type = first(type, order_by = type)) %>% 
  group_by(name, type, .add = TRUE) %>% 
  summarise(new_col = paste(new_col, collapse = ","))
#> `summarise()` regrouping output by 'group', 'name' (override with `.groups` argument)
#> # A tibble: 3 x 4
#> # Groups:   group, name [3]
#>   group name   type new_col               
#>   <dbl> <chr> <dbl> <chr>                 
#> 1     1 Jim       1 Jim,Deb,Bill,Ann      
#> 2     2 Jane      1 Joe,Jon,Jane          
#> 3     3 Gus       1 Jake,Sam,Gus,Trixy,Don
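The new_col in the question's df2 wraps each name in single quotes, which the output above does not. If that quoted string is really what is wanted, one possible variation is to paste the quotes on before collapsing. This is only a sketch: the object name quoted is made up for illustration, which.min() is swapped in for first(..., order_by = type), and .groups = "drop" is added just to return an ungrouped tibble without the regrouping message.

```r
library(dplyr)

df <- tibble(group = c(1,1,1,1,2,2,2,3,3,3,3,3),
             name = c('Jim','Deb','Bill','Ann','Joe','Jon','Jane','Jake','Sam','Gus','Trixy','Don'),
             type = c(1,2,3,4,3,2,1,2,3,1,4,5))

quoted <- df %>%
  group_by(group) %>%
  summarise(
    # build the quoted string first, while name still holds every name in the group
    new_col = paste0("'", name, "'", collapse = ","),
    # then keep the name and type of the row with the smallest type
    name = name[which.min(type)],
    type = min(type),
    .groups = "drop"
  ) %>%
  select(group, name, type, new_col)

quoted
```

The order of the expressions inside summarise() matters here, because later expressions see the columns created by earlier ones.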

EDIT: If new_col should be a list of vectors, then you could use `summarise(new_col = list(c(new_col)))`:

df %>% 
  group_by(group) %>%
  mutate(new_col = name, name = first(name, order_by = type), type = first(type, order_by = type)) %>% 
  group_by(name, type, .add = TRUE) %>% 
  summarise(new_col = list(c(new_col)))
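Once the names sit in a list column, pull can be used the way the question originally intended: it returns a plain list with one character vector per group. The following is a minimal sketch that keeps only the grouping and the list column; the object names res and vecs are made up for illustration.

```r
library(dplyr)

df <- tibble(group = c(1,1,1,1,2,2,2,3,3,3,3,3),
             name = c('Jim','Deb','Bill','Ann','Joe','Jon','Jane','Jake','Sam','Gus','Trixy','Don'),
             type = c(1,2,3,4,3,2,1,2,3,1,4,5))

# one row per group, with all names of that group stored as a list element
res <- df %>%
  group_by(group) %>%
  summarise(new_col = list(name), .groups = "drop")

# pull() on a list column gives a list of character vectors
vecs <- pull(res, new_col)

# the second element holds the names of group 2: "Joe", "Jon", "Jane"
vecs[[2]]
```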

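Another option would be to use tidyr::nest, which stores the names of each group as a one-column tibble in a list column rather than as a bare vector: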

df %>% 
  group_by(group) %>%
  mutate(new_col = name, name = first(name, order_by = type), type = first(type, order_by = type)) %>% 
  tidyr::nest(new_col = new_col)  # namespaced because only dplyr is attached above
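Since the question also asks about base R, here is a rough sketch of the per-group vectors without any packages. It uses split(), which returns a list named by the group values; a plain data.frame stands in for the tibble, and the object names by_group and quoted_strings are made up for illustration.

```r
df <- data.frame(group = c(1,1,1,1,2,2,2,3,3,3,3,3),
                 name = c('Jim','Deb','Bill','Ann','Joe','Jon','Jane','Jake','Sam','Gus','Trixy','Don'),
                 type = c(1,2,3,4,3,2,1,2,3,1,4,5),
                 stringsAsFactors = FALSE)

# split the name column by group: a named list with elements "1", "2", "3"
by_group <- split(df$name, df$group)

# the names of group 3 as a character vector
by_group[["3"]]

# optionally collapse each vector into the quoted string shown in df2
quoted_strings <- sapply(by_group, function(x) paste0("'", x, "'", collapse = ","))
quoted_strings[["2"]]
```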