I'd like to create a nested list from this data frame:

dat <- data.frame(var1 = c("A", "A", "B", "B"),
                  var2 = c("A_1", "A_2", "B_1", "B_2"),
                  val = 1:4,
                  stringsAsFactors = TRUE)  # the default before R 4.0.0; needed to reproduce the output below

> dat
  var1 var2 val
1    A  A_1   1
2    A  A_2   2
3    B  B_1   3
4    B  B_2   4

I split the data frame by var1 first:

mylist <- split(dat, dat$var1)
> mylist
$A
  var1 var2 val
1    A  A_1   1
2    A  A_2   2

$B
  var1 var2 val
3    B  B_1   3
4    B  B_2   4

Now I want to create nested lists for var2. I tried:

mylist <- lapply(mylist, function(x) split(x, x$var2))

> mylist
$A
$A$A_1
  var1 var2 val
1    A  A_1   1

$A$A_2
  var1 var2 val
2    A  A_2   2

$A$B_1
[1] var1 var2 val 
<0 rows> (or 0-length row.names)

$A$B_2
[1] var1 var2 val 
<0 rows> (or 0-length row.names)


$B
$B$A_1
[1] var1 var2 val 
<0 rows> (or 0-length row.names)

$B$A_2
[1] var1 var2 val 
<0 rows> (or 0-length row.names)

$B$B_1
  var1 var2 val
3    B  B_1   3

$B$B_2
  var1 var2 val
4    B  B_2   4

But how can I avoid creating empty data frames for nonexistent combinations of var1 and var2?

1 Answer

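Just wrap the second argument to split in droplevels. Because var2 is a factor, each subset still carries all four levels from the original data.frame, and split creates one element per level, including the levels with no rows. droplevels gets rid of those now extraneous levels before the split.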

lapply(mylist, function(x) split(x, droplevels(x$var2)))
$A
$A$A_1
  var1 var2 val
1    A  A_1   1

$A$A_2
  var1 var2 val
2    A  A_2   2


$B
$B$B_1
  var1 var2 val
3    B  B_1   3

$B$B_2
  var1 var2 val
4    B  B_2   4
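
As a possible alternative, split also has a drop argument that discards levels which do not occur, so the call to droplevels can be skipped. The following is only a sketch: it rebuilds the example data with stringsAsFactors = TRUE (an assumption, matching the pre-4.0 default so that var2 is a factor), and the names by_var1 and nested are made up for illustration.

```r
# Rebuild the example data. stringsAsFactors = TRUE is an assumption that
# mirrors the default before R 4.0.0, so var1 and var2 are factors.
dat <- data.frame(var1 = c("A", "A", "B", "B"),
                  var2 = c("A_1", "A_2", "B_1", "B_2"),
                  val = 1:4,
                  stringsAsFactors = TRUE)

# First level of nesting: one data frame per value of var1.
by_var1 <- split(dat, dat$var1)

# Each subset still carries all four levels of var2, which is why a plain
# split() on var2 yields empty data frames for the unused levels.
levels(by_var1$A$var2)

# Second level of nesting: drop = TRUE makes split() ignore unused levels.
nested <- lapply(by_var1, function(x) split(x, x$var2, drop = TRUE))

names(nested$A)
names(nested$B)
```

Here names(nested$A) should contain only A_1 and A_2, and names(nested$B) only B_1 and B_2. Both approaches should give the same nesting; drop = TRUE just keeps the fix inside the split call.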