I tried to create a reproducible example but, frustratingly, this actually works:
library(tidyverse)

# Nest mtcars by vs, keeping the car names as a column
my_mtcars <- mtcars %>%
  rownames_to_column('car') %>%
  group_by(vs) %>%
  nest()

# Split each nested data frame in two, then join the halves back together
my_mtcars <- my_mtcars %>%
  mutate(lhs = map(.x = data, ~ .x %>% select(car:drat))) %>%
  mutate(rhs = map(.x = data, ~ .x %>% select(car, wt:carb) %>% rename(model = car))) %>%
  mutate(together_again = map2(.x = lhs, .y = rhs, ~ inner_join(.x, .y, by = c('car' = 'model'))))
The above works, and it shows in a nutshell what I'm trying to do with my real data. My actual data frame, which includes list columns, fails when I mutate with an inner join. I'm hoping that by describing it and showing some anonymised data here, someone may be able to advise.
My data frame, pdata:
data
# A tibble: 104 x 7
MONETIZATION_WEEK_COHORT data cut_off clv_obj model prediction training_period_metrics
<date> <list> <int> <list> <list> <list> <list>
1 2020-03-30 <tibble [214,509 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [7,328 × 3]>
2 2020-03-30 <tibble [214,509 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [7,328 × 3]>
3 2020-04-06 <tibble [496,626 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [20,060 × 3]>
4 2020-04-06 <tibble [496,626 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [20,060 × 3]>
5 2020-04-13 <tibble [595,775 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [25,816 × 3]>
6 2020-04-13 <tibble [595,775 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [25,816 × 3]>
7 2020-04-20 <tibble [548,436 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [22,161 × 3]>
8 2020-04-20 <tibble [548,436 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [22,161 × 3]>
9 2020-04-27 <tibble [529,507 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [21,113 × 3]>
10 2020-04-27 <tibble [529,507 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [21,113 × 3]>
I'm trying to join prediction with training_period_metrics for each row. Here's a sample of each of those two fields. Both are data frames (for prediction, the data frame is its result element):
The .y field in map2 below:
pdata$prediction[[1]]$result %>% head(2) %>% glimpse
Rows: 2
Columns: 11
$ Id <chr> "123abc", "def456"
$ period.first <date> 2020-05-21, 2020-05-21
$ period.last <date> 2020-08-26, 2020-08-26
$ period.length <int> 14, 14
$ actual.x <int> 0, 0
$ actual.total.spending <dbl> 0, 0
$ PAlive <dbl> 0.72933712, 0.05683547
$ CET <dbl> 19.2692978, 0.1285307
$ DERT <dbl> 13.37550762, 0.08921192
$ predicted.mean.spending <dbl> 839.648, 1017.683
$ predicted.CLV <dbl> 11230.71800, 90.78944
The .x field in map2 below:
pdata$training_period_metrics[[1]] %>% head(2) %>% glimpse
Rows: 2
Columns: 3
$ S <chr> "abc123", "def456"
$ Transactions <int> 40, 3
$ Total_Spending <dbl> 14660, 1797
I'm trying to join these and store the resulting data frame in a new list column:
pdata %>%
  mutate(combined_data = map2(.x = training_period_metrics, .y = prediction,
                              ~ inner_join(.x, .y$result, by = c('S' = 'Id'))))
Error: Problem with `mutate()` input `combined_data`.
x `x` and `y` must share the same src, set `copy` = TRUE (may be slow).
ℹ Input `combined_data` is `map2(...)`.
How can I join prediction$result with training_period_metrics within my purrr loop?
my_mtcars$rhs[[2]] <- NULL; my_mtcars %>% mutate(together_again = map2(.x = lhs, .y = rhs, ~ inner_join(.x, .y, by = c('car' = 'model')))) # Error: Problem with `mutate()` input `together_again`. ✖ `x` and `y` must share the same src, set `copy` = TRUE (may be slow). – akrun