I have a complex function that returns multiple tibbles (or data frames) as a result of a few computations which are parametrized. These tibbles are shaped differently, so I cannot just return one tibble.

I want to access each kind of result for each parameter combination, so I build the parameter combinations and map over them with pmap_dfr. This partly works, but the output no longer shows which kind of result each row holds:

library(tidyverse)

foo <- function(.param1, .param2) {
  return(tibble(
    .param1 = .param1,
    .param2 = .param2,
    data = list(
      ret1 = tibble(ret1_col1 = c(1, 2, 3), ret1_col2 = c(1, 2, 3)),
      ret2 = tibble(ret2_col1 = c(1, 2, 3, 4, 5)),
      ret3 = tibble(ret3_col1 = c(1, 2), ret3_col2 = c(1, 2), ret3_col3 = c(1, 2))
    )
  ))
}

tibble::tribble(
  ~.param1, ~.param2,
  1, 2,
  3, 4
) %>% 
  pmap_dfr(foo)

#> # A tibble: 6 x 3
#>   .param1 .param2 data            
#>     <dbl>   <dbl> <list>          
#> 1       1       2 <tibble [3 × 2]>
#> 2       1       2 <tibble [5 × 1]>
#> 3       1       2 <tibble [2 × 3]>
#> 4       3       4 <tibble [3 × 2]>
#> 5       3       4 <tibble [5 × 1]>
#> 6       3       4 <tibble [2 × 3]>

Created on 2019-07-16 by the reprex package (v0.3.0)

For example, for the first row, which <tibble> is this referring to?

Ideally I'd get the following result:

  .param1 .param2             ret1             ret2             ret3
    <dbl>   <dbl>           <list>           <list>           <list>
1       1       2 <tibble [3 × 2]> <tibble [5 × 1]> <tibble [2 × 3]>
2       3       4 <tibble [3 × 2]> <tibble [5 × 1]> <tibble [2 × 3]>

How can I achieve this?

There isn't anything that would make sense as an ID column to show what's in row 1? - camille
@camille I don't understand your question, sorry. What kind of ID column would have to be added, or how would that help? Are you suggesting that I create an ID column e.g. as the combination of param1 and param2? - slhck
I was looking at your desired output for where you might be pulling an ID out of. I think now that was incorrect. But maybe something to reference the names of ret1, ret2, etc? - camille

2 Answers

1
votes

If I'm understanding correctly, you can make minor modifications to your function to label which data frame each row holds. One way is to add a column whose values match the data frame names, i.e. ret1, ret2, ret3.

library(tidyverse)

foo <- function(.param1, .param2) {
  dfs <- c("ret1", "ret2", "ret3") # added here
  return(tibble(
    .param1 = .param1,
    .param2 = .param2,
    col = dfs,                     # added here
    data = list(
      tibble(ret1_col1 = c(1, 2, 3), ret1_col2 = c(1, 2, 3)),
      tibble(ret2_col1 = c(1, 2, 3, 4, 5)),
      tibble(ret3_col1 = c(1, 2), ret3_col2 = c(1, 2), ret3_col3 = c(1, 2))
    ) %>%
      setNames(dfs)
  ))
}

Then you can use spread on that list column the same as you would on any other column.

tibble::tribble(
  ~.param1, ~.param2,
  1, 2,
  3, 4
) %>% 
  pmap_dfr(foo) %>%
  spread(key = col, value = data)
#> # A tibble: 2 x 5
#>   .param1 .param2 ret1             ret2             ret3            
#>     <dbl>   <dbl> <list>           <list>           <list>          
#> 1       1       2 <tibble [3 × 2]> <tibble [5 × 1]> <tibble [2 × 3]>
#> 2       3       4 <tibble [3 × 2]> <tibble [5 × 1]> <tibble [2 × 3]>
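
As a variation on the spread step above, here is a minimal sketch of the same reshaping with pivot_wider, assuming tidyr 1.0.0 or later is installed (spread is marked as superseded in newer tidyr releases). The names params and wide are placeholders of mine, and I use pmap followed by bind_rows instead of pmap_dfr only so that the sketch does not depend on a particular purrr version.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Same idea as the modified foo() above: one row per result kind,
# labelled by the `col` column.
foo <- function(.param1, .param2) {
  tibble(
    .param1 = .param1,
    .param2 = .param2,
    col = c("ret1", "ret2", "ret3"),
    data = list(
      tibble(ret1_col1 = c(1, 2, 3), ret1_col2 = c(1, 2, 3)),
      tibble(ret2_col1 = c(1, 2, 3, 4, 5)),
      tibble(ret3_col1 = c(1, 2), ret3_col2 = c(1, 2), ret3_col3 = c(1, 2))
    )
  )
}

# `params` is a placeholder name for the parameter grid.
params <- tribble(
  ~.param1, ~.param2,
  1, 2,
  3, 4
)

# Map over the rows, stack the per-combination tibbles, then turn the
# labels in `col` into column names holding the matching list entries.
wide <- params %>%
  pmap(foo) %>%
  bind_rows() %>%
  pivot_wider(names_from = col, values_from = data)
```

Here wide should have one row per parameter combination and the list columns ret1, ret2 and ret3, i.e. the same layout as the spread result.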
0
votes

One solution is to not return a single list column holding all the tibbles, but to wrap each tibble in its own list so that each one becomes a separate list column:

return(tibble(
    .param1 = .param1,
    .param2 = .param2,
    ret1 = list(tibble(ret1_col1 = c(1, 2, 3), ret1_col2 = c(1, 2, 3))),
    ret2 = list(tibble(ret2_col1 = c(1, 2, 3, 4, 5))),
    ret3 = list(tibble(ret3_col1 = c(1, 2), ret3_col2 = c(1, 2), ret3_col3 = c(1, 2)))
))

This way, pmap_dfr collects the results into one row per parameter combination, with one list column per result kind.

Full example:

library(tidyverse)

foo <- function(.param1, .param2) {
  return(tibble(
    .param1 = .param1,
    .param2 = .param2,
    ret1 = list(tibble(ret1_col1 = c(1, 2, 3), ret1_col2 = c(1, 2, 3))),
    ret2 = list(tibble(ret2_col1 = c(1, 2, 3, 4, 5))),
    ret3 = list(tibble(ret3_col1 = c(1, 2), ret3_col2 = c(1, 2), ret3_col3 = c(1, 2)))
  ))
}

tibble::tribble(
  ~.param1, ~.param2,
  1, 2,
  3, 4
) %>% 
  pmap_dfr(foo) %>% unnest(ret1)
#> # A tibble: 6 x 4
#>   .param1 .param2 ret1_col1 ret1_col2
#>     <dbl>   <dbl>     <dbl>     <dbl>
#> 1       1       2         1         1
#> 2       1       2         2         2
#> 3       1       2         3         3
#> 4       3       4         1         1
#> 5       3       4         2         2
#> 6       3       4         3         3

Created on 2019-07-16 by the reprex package (v0.3.0)
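
To connect this back to the question (accessing each result kind per parameter combination), here is a small sketch of how the wide result could be queried. The names results, ret2_for_3_4 and ret1_long are placeholders of mine, and pmap plus bind_rows stands in for pmap_dfr only to keep the sketch independent of the purrr version.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

foo <- function(.param1, .param2) {
  tibble(
    .param1 = .param1,
    .param2 = .param2,
    # Each tibble is wrapped in its own list, giving one list column each.
    ret1 = list(tibble(ret1_col1 = c(1, 2, 3), ret1_col2 = c(1, 2, 3))),
    ret2 = list(tibble(ret2_col1 = c(1, 2, 3, 4, 5))),
    ret3 = list(tibble(ret3_col1 = c(1, 2), ret3_col2 = c(1, 2), ret3_col3 = c(1, 2)))
  )
}

results <- tribble(
  ~.param1, ~.param2,
  1, 2,
  3, 4
) %>%
  pmap(foo) %>%
  bind_rows()

# Pull the ret2 tibble for one specific parameter combination.
ret2_for_3_4 <- results %>%
  filter(.param1 == 3, .param2 == 4) %>%
  pull(ret2) %>%
  pluck(1)

# Expand only ret1; selecting first drops the other list columns explicitly.
ret1_long <- results %>%
  select(.param1, .param2, ret1) %>%
  unnest(ret1)
```

Selecting the columns before unnest is a deliberate choice: as far as I know, tidyr 1.0.0 and later keep the remaining list columns when unnesting, whereas the output shown above comes from an older version that dropped them.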