I am teaching myself the purrr package from the R tidyverse and am having trouble using map() on a column of nested data frames. Could someone explain what I'm missing?
Using the base R ChickWeight dataset as an example, I can easily get the number of distinct chicks observed at each time point under diet #1 if I first filter for diet #1, like so:
library(tidyverse)
ChickWeight %>%
  filter(Diet == 1) %>%
  group_by(Time) %>%
  summarise(counts = n_distinct(Chick))
This is great, but I would like to do it for every diet at once, and I thought nesting the data and iterating over it with map() would be a good approach.
This is what I did:
example <- ChickWeight %>%
  nest(-Diet)
Calling map() directly on the nested column then gives me what I'm aiming for:
map(example$data, ~ .x %>% group_by(Time) %>% summarise(counts = n_distinct(Chick)))
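As a sanity check on the nested approach, here is a minimal sketch that compares the Diet 1 element of the map() output with the filtered summary. The helper count_chicks() and the names chicks, nested, and per_diet are only illustrative. Converting with as_tibble() first and nesting via group_by() are choices I made for this sketch, not something the snippets above do.

```r
library(tidyverse)

# Assumption: convert to a plain tibble first, since ChickWeight carries
# extra classes beyond data.frame.
chicks <- as_tibble(ChickWeight)

# Hypothetical helper: number of distinct chicks at each time point.
count_chicks <- function(df) {
  df %>%
    group_by(Time) %>%
    summarise(counts = n_distinct(Chick))
}

# One row per diet, with the remaining columns nested in `data`.
nested <- chicks %>%
  group_by(Diet) %>%
  nest()

# Apply the helper to every nested data frame and name the results by diet.
per_diet <- map(nested$data, count_chicks)
names(per_diet) <- as.character(nested$Diet)

# The filtered version for diet 1, for comparison.
diet1_direct <- chicks %>%
  filter(Diet == 1) %>%
  count_chicks()

# Compare the two sets of counts.
same_counts <- identical(per_diet[["1"]]$counts, diet1_direct$counts)
print(same_counts)
```

Naming the list by diet avoids relying on the row order of the nested data frame when picking out a single diet.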
However, when I try to run the same command inside mutate(), piping from the original data frame so that the result goes into a new column, it fails:
example %>%
  mutate(counts = map(data, ~ .x %>% group_by(Time) %>% summarise(counts = n_distinct(Chick))))
Error in eval(substitute(expr), envir, enclos) :
variable 'Chick' not found
Why does this occur?
I also tried it on the data frame split into a list, and that didn't work either:
ChickWeight %>%
  split(.$Diet) %>%
  map(data, ~ .x %>% group_by(Time) %>% summarise(counts = n_distinct(Chick)))