Problem with pipe within purrr::map2 and mutate

Question

nested_numeric <- model_table %>%
   group_by(ano_fiscal) %>%
   select(-c("ano_estudo", "payout", "div_ratio","ebitda", "name.company",
             "alavancagem","div_pl", "div_liq", "div_total")) %>%
   nest()

nested_numeric
# A tibble: 7 x 2
# Groups:   ano_fiscal [7]
  ano_fiscal data              
       <dbl> <list>            
1       2012 <tibble [34 x 10]>
2       2013 <tibble [35 x 10]>
3       2014 <tibble [35 x 10]>
4       2015 <tibble [35 x 10]>
5       2016 <tibble [35 x 10]>
6       2017 <tibble [35 x 10]>
7       2018 <tibble [35 x 10]>

df_ipca$idx
[1] 0.9652515 0.9741318 0.9817300 0.9911546 0.9941281 0.9985022 1.0000000

The list-column named "data" consists of numeric variables. I want to multiply them by a deflator index (i.e., adjust them for inflation).

This works fine:

map2_df(nested_numeric$data, df_ipca$idx, ~ .x * .y)

or even

map2(nested_numeric$data, df_ipca$idx, ~ .x * .y)

But I'm trying to create a new list-column named "adjusted_data" with the result of this operation:

nested_numeric <- model_table %>%
    group_by(ano_fiscal) %>%
    select(-c("ano_estudo", "payout", "div_ratio","ebitda", "name.company",
              "alavancagem","div_pl", "div_liq", "div_total")) %>%
    nest() %>%
    mutate( adjusted_data = data %>% {
    map2(., df_ipca$idx, ~ .x * .y)})

This gives me the following error:

Error: Column `adjusted_data` must be length 1 (the group size), not 7

I hope my problem is clear enough: I'm trying to adjust for inflation a data frame whose values are nested by year. I thought that using map2 within mutate would be enough. I've tried everything and couldn't figure out what I'm doing wrong. I've read similar questions about pipes within map2 here, but still...

Please help :) Thank you!

Can you try mutate(adjusted_data = map2(data, df_ipca$idx, ~ .x * .y))? - akrun
It was not tested, as there was no reproducible example. Thanks. - akrun

milanmlft · Accepted Answer · 2020-05-25T07:31:43

A simple solution (which does, however, break up your pipe) is to assign the new column directly:

nested_numeric$adjusted_data <- map2(nested_numeric$data, df_ipca$idx, ~ .x * .y)

For example, using the iris data:

library(tidyverse)

df_ipca <- data.frame(idx = runif(3))

# Use a new name so the built-in iris data set is not overwritten
iris_nested <- iris %>% 
  group_by(Species) %>% 
  nest()

iris_nested$adjusted_data <- map2(iris_nested$data, df_ipca$idx, ~.x * .y)
iris_nested
#> # A tibble: 3 x 3
#> # Groups:   Species [3]
#>   Species    data              adjusted_data    
#>   <fct>      <list>            <list>           
#> 1 setosa     <tibble [50 × 4]> <df[,4] [50 × 4]>
#> 2 versicolor <tibble [50 × 4]> <df[,4] [50 × 4]>
#> 3 virginica  <tibble [50 × 4]> <df[,4] [50 × 4]>

Solution using `mutate`

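If you want to do the map2 inside mutate after you have grouped and nested your data, you first have to ungroup() before calling mutate. On a grouped data frame, mutate evaluates its expressions within each group, so data is a one-element list there; map2 recycles that single element against all 7 values of df_ipca$idx and returns 7 results for a group of size 1, which is exactly what the error message complains about. After ungroup(), map2 loops over the entire data column, which is what you want: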

nested_numeric %>%
  ungroup() %>%
  mutate(
    adjusted_data = map2(data, df_ipca$idx, ~ .x * .y)
  )
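To see the grouping effect directly, here is a minimal sketch (my own illustration, not part of your code) that asks mutate how many elements of data it can see with and without ungroup(). The names nested, n_in_scope, grouped_len and ungrouped_len are made up for the example, and it uses the built-in iris data in place of model_table.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Group and nest the built-in iris data; the result is still grouped by Species
nested <- datasets::iris %>%
  group_by(Species) %>%
  nest()

# Still grouped: mutate() evaluates per group, so `data` is a
# list holding only the current group's tibble
grouped_len <- nested %>%
  mutate(n_in_scope = length(data)) %>%
  pull(n_in_scope)

# Ungrouped: `data` is the whole list-column
ungrouped_len <- nested %>%
  ungroup() %>%
  mutate(n_in_scope = length(data)) %>%
  pull(n_in_scope)

grouped_len
ungrouped_len
```

If grouped_len comes back as all ones, a map2 call against a longer index vector inside the grouped mutate will produce more results than the group has rows, which is the situation the error describes.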

For example, using the iris data:

library(tidyverse)

df_ipca <- data.frame(idx = runif(3))

iris_nested <- iris %>% 
  group_by(Species) %>% 
  nest() %>% 
  ungroup() %>% 
  mutate(
    adjusted_data = map2(data, df_ipca$idx, ~ .x * .y)
  )

# Original data
map(iris_nested$data, head)
#> [[1]]
#> # A tibble: 6 x 4
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#>          <dbl>       <dbl>        <dbl>       <dbl>
#> 1          5.1         3.5          1.4         0.2
#> 2          4.9         3            1.4         0.2
#> 3          4.7         3.2          1.3         0.2
#> 4          4.6         3.1          1.5         0.2
#> 5          5           3.6          1.4         0.2
#> 6          5.4         3.9          1.7         0.4
#> 
#> [[2]]
#> # A tibble: 6 x 4
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#>          <dbl>       <dbl>        <dbl>       <dbl>
#> 1          7           3.2          4.7         1.4
#> 2          6.4         3.2          4.5         1.5
#> 3          6.9         3.1          4.9         1.5
#> 4          5.5         2.3          4           1.3
#> 5          6.5         2.8          4.6         1.5
#> 6          5.7         2.8          4.5         1.3
#> 
#> [[3]]
#> # A tibble: 6 x 4
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#>          <dbl>       <dbl>        <dbl>       <dbl>
#> 1          6.3         3.3          6           2.5
#> 2          5.8         2.7          5.1         1.9
#> 3          7.1         3            5.9         2.1
#> 4          6.3         2.9          5.6         1.8
#> 5          6.5         3            5.8         2.2
#> 6          7.6         3            6.6         2.1
# Adjusted data
map(iris_nested$adjusted_data, head)
#> [[1]]
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1    1.0206142   0.7004215    0.2801686  0.04002409
#> 2    0.9805901   0.6003613    0.2801686  0.04002409
#> 3    0.9405660   0.6403854    0.2601566  0.04002409
#> 4    0.9205540   0.6203733    0.3001807  0.04002409
#> 5    1.0006022   0.7204336    0.2801686  0.04002409
#> 6    1.0806503   0.7804697    0.3402047  0.08004817
#> 
#> [[2]]
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1    0.3256959   0.1488896    0.2186816  0.06513919
#> 2    0.2977791   0.1488896    0.2093760  0.06979199
#> 3    0.3210431   0.1442368    0.2279872  0.06979199
#> 4    0.2559039   0.1070144    0.1861120  0.06048639
#> 5    0.3024319   0.1302784    0.2140288  0.06979199
#> 6    0.2652095   0.1302784    0.2093760  0.06048639
#> 
#> [[3]]
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1     2.399749    1.257011     2.285475   0.9522814
#> 2     2.209293    1.028464     1.942654   0.7237339
#> 3     2.704479    1.142738     2.247384   0.7999164
#> 4     2.399749    1.104646     2.133110   0.6856426
#> 5     2.475932    1.142738     2.209293   0.8380076
#> 6     2.894935    1.142738     2.514023   0.7999164

In fact, you can also omit the group_by() and ungroup() calls by providing the non-nested column (in your case, ano_fiscal) to nest():

iris %>% 
  nest(data = -Species) %>% 
  mutate(
    adjusted_data = map2(data, df_ipca$idx, ~ .x * .y)
  )

which should give the same result as before. Note that, to avoid a warning, you should name the -Species argument inside nest() (here, data = -Species).
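If you want to convince yourself that the two mutate variants agree, a quick check along these lines should do. This is only a sketch: idx, by_ungroup, by_nest_arg and same are hypothetical names, the index is random rather than a real deflator, and the comparison goes through as.matrix so that only the numbers are compared, not the data frame classes.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(1)       # fixed seed so both variants use the same index values
idx <- runif(3)   # stand-in for df_ipca$idx, one value per Species

# Variant 1: group_by() + nest() + ungroup()
by_ungroup <- datasets::iris %>%
  group_by(Species) %>%
  nest() %>%
  ungroup() %>%
  mutate(adjusted_data = map2(data, idx, ~ .x * .y))

# Variant 2: nest() with a named argument, no grouping involved
by_nest_arg <- datasets::iris %>%
  nest(data = -Species) %>%
  mutate(adjusted_data = map2(data, idx, ~ .x * .y))

# Compare the adjusted tables pairwise on their numeric content
same <- map2_lgl(
  by_ungroup$adjusted_data,
  by_nest_arg$adjusted_data,
  ~ isTRUE(all.equal(as.matrix(.x), as.matrix(.y)))
)
same
```

Both variants pair the rows of data with idx by position, so the check also relies on the Species rows coming out in the same order in both tables.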
