I have a function that returns two values in a list. Both values need to be added to a data.table in two new columns. Evaluation of the function is costly, so I would like to avoid having to compute the function twice. Here's the example:
library(data.table)
example(data.table)
DT
x y v
1: a 1 42
2: a 3 42
3: a 6 42
4: b 1 4
5: b 3 5
6: b 6 6
7: c 1 7
8: c 3 8
9: c 6 9
Here's an example of my function. Remember I said it's costly compute, on top of that there is no way to deduce one return value from the other given values (as in the example below):
myfun <- function (y, v)
{
ret1 = y + v
ret2 = y - v
return(list(r1 = ret1, r2 = ret2))
}
Here's my way to add two columns in one statement. That one needs to call myfun twice, however:
DT[,new1:=myfun(y,v)$r1][,new2:=myfun(y,v)$r2]
x y v new1 new2
1: a 1 42 43 -41
2: a 3 42 45 -39
3: a 6 42 48 -36
4: b 1 4 5 -3
5: b 3 5 8 -2
6: b 6 6 12 0
7: c 1 7 8 -6
8: c 3 8 11 -5
9: c 6 9 15 -3
Any suggestions on how to do this? I could save r2
in a separate environment each time I call myfun, I just need a way to add two columns by reference at a time.
data.table
for its memory efficiency with large datasets; it doesn't copyx
,y
orv
at all, even once. Think 20GB datasets in RAM. – Matt Dowle