2
votes

I am trying to generate new columns in a tibble from the output of a function that takes as input several existing columns of that tibble plus user data. As a simplified example, I would want to use this function

addup <- function(x, y, z){x + y + z}

and use it to add the numbers in the existing columns in this tibble...

library(tidyverse)
set.seed(1)
(tib <- tibble(num1 = sample(12), num2 = sample(12)))
# A tibble: 12 x 2
    num1  num2
   <int> <int>
 1     8     5
 2     6     3
 3     7     7
 4     3    11
 5     1     2
 6     2     1
 7    11     6
 8    10     9
 9     4     8
10     9    12
11     5    10
12    12     4

...together with user input. For instance, if a user defines the vector

vec <- c(3,6,4)

I would like to generate one new column per element of vec, each containing the sum of num1, num2, and that element.

The desired result in this case would look something like this (the num1 and num2 values shown here differ from the tibble printed above):

# A tibble: 12 x 5
    num1  num2   `3`   `6`   `4`
   <int> <int> <dbl> <dbl> <dbl>
 1     5     7    15    18    16
 2     8     2    13    16    14
 3     7     9    19    22    20
 4     1    11    15    18    16
 5     3     3     9    12    10
 6     9    12    24    27    25
 7     6     6    15    18    16
 8    10    10    23    26    24
 9    11     4    18    21    19
10    12     5    20    23    21
11     4     1     8    11     9
12     2     8    13    16    14

If I know vec beforehand, I could achieve this by

tib %>% 
  mutate("3" = map2_dbl(num1, num2, ~addup(.x, .y, 3)),
         "6" = map2_dbl(num1, num2, ~addup(.x, .y, 6)), 
         "4" = map2_dbl(num1, num2, ~addup(.x, .y, 4))) 

but as the length of vec can vary, I do not know how to generalize this. I've found the answer "repeated mutate in tidyverse", but there the functions are applied repeatedly over the existing columns, rather than taking multiple existing columns as inputs for the mapping.

Any ideas?

I am not clear on the desired interface. Do the existing columns need to be user specified? Can there be different numbers of those too? Does your equivalent of addup need to operate on single elements of the columns (hence map2), or does it take vectors as inputs? - Calum You
The existing tibble and its columns are already present and are not user-specified. The function I use is a custom plot function that, with each call, uses the single values from the columns as input. - MartijnVanAttekum

2 Answers

3
votes

Since neither the function nor the column names need to be passed as arguments, this is relatively simple. You just need to iterate over vec with a function that returns the summed column, and then combine the result with the original table. If your addup function accepts vector inputs, you can skip the whole map2 part; this one does, but I don't know whether your real function does.

library(tidyverse)
vec <- c(3,6,4)
set.seed(1)
tib <- tibble(num1 = sample(12), num2 = sample(12))

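addup <- function(c1, c2, z) {c1 + c2 + z}
addup_vec <- function(df, vec) {
  # For each element of vec, build one new column from num1 and num2
  new_cols <- map_dfc(
    .x = vec,
    .f = function(v) {
      # Call addup element-wise on the two existing columns
      map2_dbl(
        .x = df[["num1"]],
        .y = df[["num2"]],
        .f = ~ addup(.x, .y, v)
      )
    }
  )
  # Name the new columns after the values in vec
  colnames(new_cols) <- as.character(vec)
  bind_cols(df, new_cols)
}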

tib %>%
  addup_vec(vec)
#> # A tibble: 12 x 5
#>     num1  num2   `3`   `6`   `4`
#>    <int> <int> <dbl> <dbl> <dbl>
#>  1     4     9    16    19    17
#>  2     5     5    13    16    14
#>  3     6     8    17    20    18
#>  4     9    11    23    26    24
#>  5     2     6    11    14    12
#>  6     7     7    17    20    18
#>  7    10     3    16    19    17
#>  8    12     4    19    22    20
#>  9     3    12    18    21    19
#> 10     1     1     5     8     6
#> 11    11     2    16    19    17
#> 12     8    10    21    24    22

Created on 2019-01-16 by the reprex package (v0.2.0).
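
As a rough sketch of the remark above about skipping map2: if the real function accepts whole columns (an assumption; it would not hold for a function that must be called once per row, such as a plotting function), each new column can be computed in a single call. The small hand-typed tibble below is only for illustration and is not the output of sample().

```r
library(tibble)
library(dplyr)
library(purrr)

# Assumed to be vectorised: works on whole columns at once
addup <- function(x, y, z) x + y + z
vec <- c(3, 6, 4)

# Illustrative data typed by hand (not generated with sample())
tib <- tibble(num1 = c(8L, 6L, 7L), num2 = c(5L, 3L, 7L))

new_cols <- vec %>%
  set_names(as.character(vec)) %>%          # names become "3", "6", "4"
  map(~ addup(tib$num1, tib$num2, .x)) %>%  # one full column per element of vec
  as_tibble()

result <- bind_cols(tib, new_cols)
print(result)
```

Naming the elements of vec before mapping means the new columns already carry the right names, so no separate colnames step is needed.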

2
votes

This uses lapply to apply the function to each element of your vector, then binds the result to the original data frame and adds column names.

# Given example
library(tibble)
set.seed(1)
(tib <- tibble(num1 = sample(12), num2 = sample(12)))
addup <- function(x, y, z){x + y + z}
vec <- c(3,6,4)

# Add columns and bind to original data frame
foo <- cbind(tib, lapply(vec, function(x) addup(tib$num1, tib$num2, x)))

# Correct column names
colnames(foo)[(ncol(tib)+1):ncol(foo)] <- vec

# Print result
print(foo)

#    num1 num2  3  6  4
# 1     4    9 16 19 17
# 2     5    5 13 16 14
# 3     6    8 17 20 18
# 4     9   11 23 26 24
# 5     2    6 11 14 12
# 6     7    7 17 20 18
# 7    10    3 16 19 17
# 8    12    4 19 22 20
# 9     3   12 18 21 19
# 10    1    1  5  8  6
# 11   11    2 16 19 17
# 12    8   10 21 24 22
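
A possible variation on the same lapply idea, sketched in base R only: assigning the list of results directly to columns named after vec avoids the separate renaming step. This assumes addup is vectorised, and the data frame below is typed by hand purely for illustration.

```r
# Assumed to be vectorised: works on whole columns at once
addup <- function(x, y, z) x + y + z
vec <- c(3, 6, 4)

# Illustrative data typed by hand (not generated with sample())
df <- data.frame(num1 = c(8, 6, 7), num2 = c(5, 3, 7))

# Compute the new columns first, from the original columns only
new_cols <- lapply(vec, function(z) addup(df$num1, df$num2, z))

# Create one column per element of vec, named "3", "6", "4"
df[as.character(vec)] <- new_cols

print(df)
```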