4
votes

I would like to call functions by their character name on a data.table. Each function also has a vector of arguments (so there is a long list of functions to apply to the data.table). The arguments are data.table columns. My first thought was that do.call would be a good approach for this task. Here is a simple example with one function name to run and its vector of columns to pass:

library(data.table)
# set up dummy data
set.seed(1)
DT <- data.table(x = rep(c("a","b"),each=5), y = sample(10), z = sample(10))
# columns to use as function arguments
mycols <- c('y','z')
# function name 
func <- 'sum'
# my current solution:
DT[, do.call(func, list(get('y'), get('z'))), by = x]
#    x V1
# 1: a 47
# 2: b 63  

I am not satisfied with that, since it requires naming each column explicitly. I would like to pass just the character vector mycols.

Another solution, which works just as I need in this case, is:

DT[, do.call(func, .SD), .SDcols = mycols, by = x]

But there is a hiccup with custom functions and the only solution that works for me is the first one:

#own dummy function    
myfunc <- function(arg1, arg2){
  arg1+arg2
}
func <- 'myfunc'
DT[, do.call(func, list(get('y'), get('z'))), by = x] 
#   x V1
#  1: a  6
#  2: a  6
#  3: a 11
#  4: a 17
#  5: a  7
#  6: b 15
#  7: b 17
#  8: b 10
#  9: b 11
# 10: b 10
# second solution does not work 
DT[, do.call(func, .SD), .SDcols = mycols, by = x]
# Error in myfunc(y = c(3L, 4L, 5L, 7L, 2L), z = c(3L, 2L, 6L, 10L, 5L)) : 
#  unused arguments (y = c(3, 4, 5, 7, 2), z = c(3, 2, 6, 10, 5))

As I understand it, do.call assumes that myfunc has arguments named y and z, which is not true. Instead, the variables y and z should be passed to the arguments arg1 and arg2.

I also tried the mget function, but with no success either:

DT[, do.call(func, mget(mycols)), by = x] 
# Error: value for ‘y’ not found

I could be missing something fairly obvious, thanks in advance for any guidance.

3
I would go with call or as.call instead of do.call. It needs to be wrapped in eval, but gives some more flexibility. Something like eval(as.call(list(func, as.name("y"), as.name("z")))). - jangorecki
@jangorecki Yes, DT[, eval(call(func, as.name("y"), as.name("z"))), by = x] works for me as well. But I would like to use the character vector mycols instead of naming y and z explicitly. - wasyl
Are you looking for something like DT[, Reduce(func, mget(mycols)), by = x]? - A5C1D2H2I1M1N2O1R2T1
@AnandaMahto That is very close to what I am looking for. For some reason, if you define a custom function with more than 2 arguments, your solution throws an error: Error in f(init, x[[i]]) : argument "arg3" is missing, with no default. Any ideas? - wasyl
Possible duplicate of Unused arguments in R - Alex

3 Answers

2
votes

This is likely to be dependent on the types of functions you want to use, but it seems like Reduce might be of interest to you.

Here it is with both of your examples:

mycols <- c('y','z')
func <- 'sum'

DT[, Reduce(func, mget(mycols)), by = x]
#    x V1
# 1: a 47
# 2: b 63

myfunc <- function(arg1, arg2){
  arg1+arg2
}
func <- 'myfunc'

DT[, Reduce(func, mget(mycols)), by = x]
#     x V1
#  1: a  6
#  2: a  6
#  3: a 11
#  4: a 17
#  5: a  7
#  6: b 15
#  7: b 17
#  8: b 10
#  9: b 11
# 10: b 10
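
The error reported in the comments for functions with more than two arguments follows from how Reduce works: it folds the list pairwise, so the function is always called with exactly two arguments. Below is a minimal base R sketch of that behavior; cols, f2 and f3 are hypothetical names used only for illustration, and cols merely stands in for one group's columns.

```r
# Hypothetical stand-in for the columns of one group
cols <- list(y = 1:2, z = 3:4, w = 5:6)

f2 <- function(arg1, arg2) arg1 + arg2
f3 <- function(arg1, arg2, arg3) arg1 + arg2 + arg3

# Reduce folds pairwise: f2(f2(y, z), w)
folded <- Reduce("f2", cols)

# A three-argument function is still called with only two arguments,
# so evaluating arg3 fails; the message is captured instead of stopping
fold_error <- tryCatch(Reduce("f3", cols), error = function(e) conditionMessage(e))

# do.call passes every list element in a single call;
# unname() makes the matching positional
all_at_once <- do.call("f3", unname(cols))
```

So Reduce fits functions that combine two values at a time, while do.call fits functions whose number of arguments equals the number of columns.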
1
votes

Yes you are missing something (well, it's not really obvious, but careful debugging of the error identifies the problem). Your function expects named arguments arg1 and arg2. You are passing it arguments y = ... and z = ... via do.call (which you have noticed). The solution is to pass the list without names:

> DT[, do.call(func, unname(.SD[, mycols, with = FALSE])), by = x]
    x V1
 1: a  6
 2: a  6
 3: a 11
 4: a 17
 5: a  7
 6: b 15
 7: b 17
 8: b 10
 9: b 11
10: b 10
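
The name matching described above can be reproduced in base R without data.table. This is only a sketch: args_named is a hypothetical stand-in for one group's .SD, and the setNames variant is an alternative not taken from the answer.

```r
myfunc <- function(arg1, arg2) arg1 + arg2

# Hypothetical stand-in for one group's .SD: a list named after the columns
args_named <- list(y = c(3L, 4L), z = c(3L, 2L))

# List names are used as argument names, i.e. myfunc(y = ..., z = ...),
# which fails; the error message is captured instead of stopping
named_result <- tryCatch(
  do.call("myfunc", args_named),
  error = function(e) conditionMessage(e)
)

# Without names, the elements are matched by position
positional_result <- do.call("myfunc", unname(args_named))

# Alternative: rename the elements to the function's formal argument names
renamed_result <- do.call("myfunc", setNames(args_named, names(formals(myfunc))))
```

Dropping the names is the simpler choice when the order of mycols already matches the order of the function's arguments.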
0
votes

Here is a solution that helped me to achieve what I want.

func <- 'sum'
mycols <- c('y','z')
DT[, do.call(func, lapply(mycols, function(x) get(x))), by = x]
#    x V1
# 1: a 47
# 2: b 63

It accepts both base functions and custom-defined functions, so it is less restrictive than the Reduce solution.
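
For the original goal of running a long list of functions, each with its own vector of columns, the same idea can be wrapped in a loop. The following is a sketch under assumptions: the tasks list and the small deterministic DT are hypothetical, and unname(as.list(.SD)) is used here in place of the lapply/get construction so that the arguments are passed by position.

```r
library(data.table)

# Small deterministic table (hypothetical data, not the question's)
DT <- data.table(x = c("a", "a", "b", "b"), y = 1:4, z = 5:8)

myfunc <- function(arg1, arg2) arg1 + arg2

# Hypothetical task list: function name -> columns passed as its arguments
tasks <- list(sum = c("y", "z"), myfunc = c("y", "z"))

# For each task, restrict .SD to the task's columns and pass them
# positionally to the function named by fname, group by group
results <- lapply(names(tasks), function(fname) {
  DT[, do.call(fname, unname(as.list(.SD))), by = x, .SDcols = tasks[[fname]]]
})
names(results) <- names(tasks)
```

Each element of results is a data.table with the grouping column x and the function's output in V1.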