I have two data.table
s:
library(data.table) # 1.9.6
DT1 <- data.table(id = rep(c("a", "b"), c(3, 2)),
x = c(6,12,15,37,41),
y = 1)
DT1[]
DT2 <- data.table(id = rep(c("a", "b"), c(4, 5)),
x = c(2,7,11,15,27,30, 35, 38,40))
DT2[]
I would like to do a rolling join with the tables and update DT2
by reference with the value of y
from DT1
. This almost does what I want:
DT2[DT1, y := y, on = c("id", "x"), roll = -Inf]
DT2[]
However, I noticed that after this operation
DT2[4]
has y = 1
, whereas, if I do the rolling join either a) without row 2 of DT1
OR without row 3 of DT1
I always get the same result since in both cases, the y from DT1 is joined to row 4 in DT2
- example:
DT2[, y:= NULL]
DT2[DT1[!2], y := y, on = c("id", "x"), roll = -Inf]
DT2[]
DT2[, y:= NULL]
DT2[DT1[!3], y := y, on = c("id", "x"), roll = -Inf]
DT2[]
What I would like to achieve as a result is
DT2[4, y := 2][]
i.e. where the join leads to multiple values in a "cell", I want the total / aggregate value. Is this possible using data.table
's rolling join and update-by-reference functionality?