0
votes

Suppose I have the iris dataset nested by the "Species" column. How can I apply purrr::map to this nested data to:

  1. filter rows inside each "Species" group, for example keeping those with Sepal.Length > 5
  2. mutate a new column inside each "Species" group that is the sum of Sepal.Length and Petal.Length?

Thank you!!


2 Answers

0
votes

You can probably try this -

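library(dplyr)
library(tidyr)  # nest() comes from tidyr, not dplyr
library(purrr)

iris %>%
  nest(data = -Species) %>%
  # apply the same filter + mutate to each per-Species tibble
  mutate(data = map(data, ~ .x %>%
                      filter(Sepal.Length > 5) %>%
                      mutate(sum = Sepal.Length + Petal.Length)))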

#  Species    data             
#  <fct>      <list>           
#1 setosa     <tibble [22 × 5]>
#2 versicolor <tibble [47 × 5]>
#3 virginica  <tibble [49 × 5]>
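
To check what the map() call did inside each group, the list-column can be flattened back into ordinary rows. This is only a sketch: the names `nested` and `flat` are made up for illustration, and it assumes tidyr is installed alongside dplyr and purrr.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same pipeline as above, stored so it can be inspected
nested <- iris %>%
  nest(data = -Species) %>%
  mutate(data = map(data, ~ .x %>%
                      filter(Sepal.Length > 5) %>%
                      mutate(sum = Sepal.Length + Petal.Length)))

# Flatten the list-column back into one row per flower
flat <- nested %>% unnest(data)

# Every remaining row should satisfy the filter
all(flat$Sepal.Length > 5)

# The new column should be the row-wise sum of the two lengths
all.equal(flat$sum, flat$Sepal.Length + flat$Petal.Length)
```

Both checks should come back TRUE if the filter and mutate ran per group as intended.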
0
votes

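You can try this too. In many use cases that apply purrr::map() to nested data, we can get by with just {dplyr} in its newer versions (>= 1.0.0), because nest_by() returns a row-wise data frame in which each row's data is a single tibble.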

library(dplyr) # version 1.0.6

iris %>% 
  nest_by(Species) %>% 
  mutate(data_new = list(
    data %>% 
      filter(Sepal.Length > 5) %>% 
      mutate(sum = Sepal.Length + Petal.Length)
  )) %>%
  ungroup()

# # A tibble: 3 x 3
#   Species                  data data_new         
#   <fct>      <list<tibble[,4]>> <list>           
# 1 setosa               [50 x 4] <tibble [22 x 5]>
# 2 versicolor           [50 x 4] <tibble [47 x 5]>
# 3 virginica            [50 x 4] <tibble [49 x 5]>
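
A possible way to inspect the nest_by() result is to drop the untouched `data` copy and unnest `data_new`. This sketch uses the row-wise sum of Sepal.Length and Petal.Length that the question asks for; the names `by_row` and `flat_by_row` are hypothetical, and unnest() assumes tidyr is available.

```r
library(dplyr)
library(tidyr)

by_row <- iris %>%
  nest_by(Species) %>%            # row-wise: one row per Species
  mutate(data_new = list(         # list() keeps the tibble as a list-column
    data %>%
      filter(Sepal.Length > 5) %>%
      mutate(sum = Sepal.Length + Petal.Length)
  )) %>%
  ungroup()

# Keep only the processed list-column and flatten it
flat_by_row <- by_row %>%
  select(Species, data_new) %>%
  unnest(data_new)

# Rows kept per Species after the filter
count(flat_by_row, Species)
```

The per-Species counts should match the tibble sizes shown in the output above (22, 47 and 49 rows).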