I want to be able to apply operations to a data frame (tibble) column that contains S3 list-like objects, acting on one of the named items from each object in the column. As shown at the end of the question, I have this working using sapply() within mutate(), but that seems like it should be unnecessary.
Where information is stored in columns containing atomic data, dplyr functions like mutate() work as expected. This works, for example:
library(dplyr)
people_cols <- tibble(name = c("Fiona Foo", "Barry Bar", "Basil Baz"),
                      height_mm = c(1750, 1700, 1800),
                      weight_kg = c(75, 73, 74)) %>%
  mutate(height_inch = height_mm / 25.4)
people_cols
# # A tibble: 3 × 4
# name height_mm weight_kg height_inch
# <chr> <dbl> <dbl> <dbl>
# 1 Fiona Foo 1750 75 68.89764
# 2 Barry Bar 1700 73 66.92913
# 3 Basil Baz 1800 74 70.86614
But I want to work with data in S3 list objects. Here is a toy example:
person_stats <- function(name, height_mm, weight_kg) {
  this_person <- structure(list(name = name,
                                height_mm = height_mm,
                                weight_kg = weight_kg),
                           class = "person_stats")
}
fiona <- person_stats("Fiona Foo", 1750, 75)
barry <- person_stats("Barry Bar", 1700, 73)
basil <- person_stats("Basil Baz", 1800, 74)
fiona$height_mm
# [1] 1750
I can put these objects into a tibble column like this:
people <- tibble(personstat = list(fiona, barry, basil))
people
# # A tibble: 3 × 1
# personstat
# <list>
# 1 <S3: person_stats>
# 2 <S3: person_stats>
# 3 <S3: person_stats>
But if I try to use mutate() on the column containing those objects I get errors:
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  mutate(height_inch = personstat$height_mm / 25.4)
# Error in mutate_impl(.data, dots) : object 'personstat' not found
To keep it as simple as possible: if I could at least reference the named items on their own, I could get them into a new column and then do whatever operations I need on that column:
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  mutate(height_mm = personstat$height_mm)
# Error in mutate_impl(.data, dots) :
# Unsupported type NILSXP for column "height_mm"
Note the different error, which is interesting - it's no longer complaining about finding the column, just struggling with the named item.
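My tentative reading of that second error, sketched in base R only (so it assumes nothing about dplyr internals): `$` is applied to the outer list that forms the column, not to each object inside it. The outer list has no element named height_mm, so the lookup returns NULL, and NILSXP is R's internal type for NULL. The two-element list below is just a cut-down stand-in for the column.

```r
# Cut-down stand-ins for the person_stats objects in this question.
fiona <- structure(list(name = "Fiona Foo", height_mm = 1750, weight_kg = 75),
                   class = "person_stats")
barry <- structure(list(name = "Barry Bar", height_mm = 1700, weight_kg = 73),
                   class = "person_stats")

# A list column is an ordinary (here unnamed) list of objects.
personstat <- list(fiona, barry)

# `$` looks for an element called height_mm on the OUTER list, finds none,
# and returns NULL rather than one value per object.
outer_lookup <- personstat$height_mm
is.null(outer_lookup)

# The same lookup on a single element works as expected.
first_height <- personstat[[1]]$height_mm
first_height
```

If that reading is right, the problem is not the S3 class as such but that `$` is not vectorised over the elements of a list.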
I can get it to work using base functions, cbind() and sapply(), with [[ as the function:
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  cbind(height_mm = sapply(.$personstat, '[[', name="height_mm"))
people
# personstat height_mm
# 1 Fiona Foo, 1750, 75 1750
# 2 Barry Bar, 1700, 73 1700
# 3 Basil Baz, 1800, 74 1800
Though that loses the tibble-iness.
class(people)
# [1] "data.frame"
And finally, that got me to this, which works, but it feels like using sapply() sort of misses the point of dplyr's mutate(), which I think should work all the way down a column without needing that:
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  mutate(height_mm = sapply(.$personstat, '[[', name="height_mm"))
people
# A tibble: 3 x 2
# personstat height_mm
# <list> <dbl>
# 1 <S3: person_stats> 1750
# 2 <S3: person_stats> 1700
# 3 <S3: person_stats> 1800
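For what it's worth, the same extraction can be written in a type-checked way with base R's vapply(). This is only a sketch of the extraction step on a plain list, outside of any tibble, and it still loops over the elements explicitly, so it does not remove the need for an apply-style helper. The constructor is repeated here so the snippet runs on its own.

```r
# Same constructor as earlier in the question, repeated for self-containment.
person_stats <- function(name, height_mm, weight_kg) {
  structure(list(name = name, height_mm = height_mm, weight_kg = weight_kg),
            class = "person_stats")
}

persons <- list(person_stats("Fiona Foo", 1750, 75),
                person_stats("Barry Bar", 1700, 73),
                person_stats("Basil Baz", 1800, 74))

# vapply() declares the expected result up front (one numeric value per
# element) and raises an error if any element returns something else.
heights <- vapply(persons, function(p) p[["height_mm"]], numeric(1))
heights
```

Unlike sapply(), which may silently return a list when an element is missing or has the wrong length, vapply() fails loudly in that case.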
Is there any way of using mutate() to get the tibble output shown above, without having to rely on something like sapply()? Or, indeed, are there any other sensible ways of extracting named values from list-like S3 objects stored in a column of a tibble?