Add multiple columns to R data.table in one function call?

Question

I have a function that returns two values in a list. Both values need to be added to a data.table in two new columns. Evaluation of the function is costly, so I would like to avoid having to compute the function twice. Here's the example:

library(data.table)
example(data.table)
DT
   x y  v
1: a 1 42
2: a 3 42
3: a 6 42
4: b 1  4
5: b 3  5
6: b 6  6
7: c 1  7
8: c 3  8
9: c 6  9

Here's an example of my function. Remember that it is costly to compute; on top of that, there is no way to deduce one return value from the other (the example below is only a simplified stand-in):

myfun <- function(y, v) {
  ret1 = y + v
  ret2 = y - v
  return(list(r1 = ret1, r2 = ret2))
}

Here's my way to add two columns in one statement. That one needs to call myfun twice, however:

DT[,new1:=myfun(y,v)$r1][,new2:=myfun(y,v)$r2]

   x y  v new1 new2
1: a 1 42   43  -41
2: a 3 42   45  -39
3: a 6 42   48  -36
4: b 1  4    5   -3
5: b 3  5    8   -2
6: b 6  6   12    0
7: c 1  7    8   -6
8: c 3  8   11   -5
9: c 6  9   15   -3

Any suggestions on how to do this? I could save r2 in a separate environment each time I call myfun, but what I really need is a way to add two columns by reference at once.

Why not have your function take in a data frame and return a data frame directly? `myfun <- function (y, v) { ret1 = y + v; ret2 = y - v; return(list(r1 = ret1, r2 = ret2)) }` — Etienne Low-Décarie
@Etienne Because that copies the inputs to create a new output. Florian is using data.table for its memory efficiency with large datasets; it doesn't copy x,y or v at all, even once. Think 20GB datasets in RAM. — Matt Dowle

flodel · Accepted Answer · 2012-07-03T10:25:00

Since data.table v1.8.3, you can do this:

DT[, c("new1","new2") := myfun(y,v)]
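To check that this form evaluates the function only once, here is a minimal sketch. It rebuilds a table with the same values as the question's DT and adds a call counter to myfun; the counter `calls` is a hypothetical addition for illustration only, not part of the original function.

```r
library(data.table)

# Stand-in for the question's table (values copied from the printed DT)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), times = 3),
                 v = c(42, 42, 42, 4, 5, 6, 7, 8, 9))

calls <- 0L  # hypothetical counter: records how often myfun runs

myfun <- function(y, v) {
  calls <<- calls + 1L            # count this evaluation
  list(r1 = y + v, r2 = y - v)    # same two results as in the question
}

# The list returned by a single call fills both new columns by reference
DT[, c("new1", "new2") := myfun(y, v)]

print(DT)
print(calls)  # number of times myfun was evaluated
```

Because there is no `by` here, the right-hand side is evaluated once for the whole table, and the list elements are matched to the names on the left by position.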

Another option is storing the output of the function and adding the columns one-by-one:

z <- myfun(DT$y,DT$v)
head(DT[,new1:=z$r1][,new2:=z$r2])
#      x y  v new1 new2
# [1,] a 1 42   43  -41
# [2,] a 3 42   45  -39
# [3,] a 6 42   48  -36
# [4,] b 1  4    5   -3
# [5,] b 3  5    8   -2
# [6,] b 6  6   12    0
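If the new columns may simply take the names that the function already gives its results, the stored output can also be assigned in a single step. The following is a small sketch under that assumption; the three-row table uses made-up toy values, and the resulting column names `r1` and `r2` come from myfun's return list rather than from the question's `new1`/`new2`.

```r
library(data.table)

# Toy table with made-up values (not the question's data)
DT <- data.table(y = c(1, 3, 6), v = c(42, 4, 9))

myfun <- function(y, v) list(r1 = y + v, r2 = y - v)

z <- myfun(DT$y, DT$v)   # the costly function is evaluated once, outside DT
DT[, (names(z)) := z]    # parenthesised LHS: use the names stored in z

print(DT)
```

Wrapping `names(z)` in parentheses tells data.table to treat it as an expression yielding column names rather than as a literal column name.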
