library(tidyverse)
diamonds %>% glimpse
Rows: 53,940
Columns: 10
$ carat   <dbl> 0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24, 0.26, 0.22, 0.23, 0.30, 0.23, 0.22, 0.31, 0.20, 0.32, 0.30, 0.30, 0.30, 0.30, 0.30, 0.23, 0.23, 0.31, 0.31, 0.…
$ cut     <ord> Ideal, Premium, Good, Premium, Good, Very Good, Very Good, Very Good, Fair, Very Good, Good, Ideal, Premium, Ideal, Premium, Premium, Ideal, Good, Good,…
$ color   <ord> E, E, E, I, J, J, I, H, E, H, J, J, F, J, E, E, I, J, J, J, I, E, H, J, J, G, I, J, D, F, F, F, E, E, D, F, E, H, D, I, I, J, D, D, H, F, H, H, E, H, F,…
$ clarity <ord> SI2, SI1, VS1, VS2, SI2, VVS2, VVS1, SI1, VS2, VS1, SI1, VS1, SI1, SI2, SI2, I1, SI2, SI1, SI1, SI1, SI2, VS2, VS1, SI1, SI1, VVS2, VS1, VS2, VS2, VS1, …
$ depth   <dbl> 61.5, 59.8, 56.9, 62.4, 63.3, 62.8, 62.3, 61.9, 65.1, 59.4, 64.0, 62.8, 60.4, 62.2, 60.2, 60.9, 62.0, 63.4, 63.8, 62.7, 63.3, 63.8, 61.0, 59.4, 58.1, 60…
$ table   <dbl> 55, 61, 65, 58, 58, 57, 57, 55, 61, 61, 55, 56, 61, 54, 62, 58, 54, 54, 56, 59, 56, 55, 57, 62, 62, 58, 57, 57, 61, 57, 57, 57, 59, 58, 58, 59, 59, 54, …
$ price   <int> 326, 326, 327, 334, 335, 336, 336, 337, 337, 338, 339, 340, 342, 344, 345, 345, 348, 351, 351, 351, 351, 352, 353, 353, 353, 354, 355, 357, 357, 357, 40…
$ x       <dbl> 3.95, 3.89, 4.05, 4.20, 4.34, 3.94, 3.95, 4.07, 3.87, 4.00, 4.25, 3.93, 3.88, 4.35, 3.79, 4.38, 4.31, 4.23, 4.23, 4.21, 4.26, 3.85, 3.94, 4.39, 4.44, 3.…
$ y       <dbl> 3.98, 3.84, 4.07, 4.23, 4.35, 3.96, 3.98, 4.11, 3.78, 4.05, 4.28, 3.90, 3.84, 4.37, 3.75, 4.42, 4.34, 4.29, 4.26, 4.27, 4.30, 3.92, 3.96, 4.43, 4.47, 4.…
$ z       <dbl> 2.43, 2.31, 2.31, 2.63, 2.75, 2.48, 2.47, 2.53, 2.49, 2.39, 2.73, 2.46, 2.33, 2.71, 2.27, 2.68, 2.68, 2.70, 2.71, 2.66, 2.71, 2.48, 2.41, 2.62, 2.59, 2.…

I would like to split diamonds into multiple data frames based on cut:

diamonds$cut %>% table()
.
     Fair      Good Very Good   Premium     Ideal 
     1610      4906     12082     13791     21551

So there would be 5 data frames in the list, with 1610, 4906, 12082, etc. rows, each accessible by name, e.g. diamonds_split$Fair.

Tried:

diamonds_split <- diamonds %>% split(cut)
Error in unique.default(x, nmax = nmax) : 
  unique() applies only to vectors

Then tried:

diamonds_split <- diamonds %>% group_by(cut) %>% group_split

This runs, but I don't understand what the resulting variable diamonds_split is. The RStudio Environment pane calls it a Large vctrs list. What I'm after is a list of data frames accessible by the names Fair, Good, Very Good, etc.

How can I split diamonds up into multiple dataframes based on cut and then be able to access each piece with diamonds_split$[cut name] e.g. diamonds_split$Fair?

1 Answer

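Use base split() with the column passed explicitly as the grouping factor, and save the result in a new object. Your first attempt failed because split() does not look cut up inside the data frame, so the bare name cut resolved to the base R function cut() rather than the column. group_split() does return one data frame per group, but it leaves the list elements unnamed. split() returns a list named by the levels of cut, which gives you the desired data frames: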

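# Split into a named list with one data frame per level of cut
Lcut <- split(diamonds, diamonds$cut)
# Access a single piece by name
Lcut$Fair
# Check the dimensions (rows, columns) of each piece
lapply(Lcut, dim)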

Output:

$Fair
[1] 1610   10

$Good
[1] 4906   10

$`Very Good`
[1] 12082    10

$Premium
[1] 13791    10

$Ideal
[1] 21551    10
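
If you would rather keep everything in a pipeline, as in the question, the same split can be written with the magrittr `.` placeholder, which stands for the piped data frame. This is only a sketch, assuming the magrittr pipe (the one tidyverse loads) and the question's variable name diamonds_split:

```r
library(ggplot2)   # provides the diamonds data set
library(magrittr)  # provides %>% and the `.` placeholder

# `.` is the piped data frame, so this is equivalent to
# split(diamonds, diamonds$cut).
diamonds_split <- diamonds %>% split(.$cut)

# The list has one element per level of cut, named after that level.
names(diamonds_split)
nrow(diamonds_split$Fair)

# A name containing a space needs backticks or [[ ]].
nrow(diamonds_split[["Very Good"]])
nrow(diamonds_split$`Very Good`)
```

The row counts should match the table() output in the question, for example 1610 for Fair.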