3
votes

I'm having some trouble implementing a rolling join using data.table.

I have two tables

m<-data.table(PolicyNo=rep(1709119, 4), EFDT=as.Date(c("2013-02-01","2014-02-01", "2012-3-1", "2011-3-1")))
setkey(m,"PolicyNo","EFDT")
m
   PolicyNo       EFDT
1:  1709119 2011-03-01
2:  1709119 2012-03-01
3:  1709119 2013-02-01
4:  1709119 2014-02-01

p<-data.table(PolicyNo=rep(1709119, 2), EFDT=as.Date(c("2013-02-01", "2012-1-1")))
setkey(p,"PolicyNo","EFDT")
p
   PolicyNo       EFDT
1:  1709119 2012-01-01
2:  1709119 2013-02-01

The EFDTs in p are the "correct" EFDTs (i.e. the dates I want to keep). I'd like to join m with p, based on matching policy numbers and roll the EFDTs in m backwards to their first matching EFDT in p. In this example the result should be

result
   PolicyNo       EFDT     EFDT.m
1:  1709119 2012-01-01 2012-03-01
2:  1709119 2013-02-01 2013-02-01
3:  1709119 2013-02-01 2014-02-01

I tried

m[, EFDT.m := EFDT]
m[p, roll=-Inf, nomatch=0]

with no luck.

1
why does your desired result have 3 rows instead of 2 (the number of rows in p)?eddi
@eddi rows 3 and 4 in m should match row 2 in p because they have matching policy numbers and because their EFDTs (in m) are closest to 2013-02-01 when rolling backwards. Simple example - suppose you have a table of marketing campaigns and a table of product sales. In this case, you'd want to match each product sale with each marketing campaign based on the date the sale occurred, rolling backwards to the most recent marketing campaign. Each campaign can be tied to more than one sale.Ben

1 Answers

2
votes

Add a new column to p instead of m, since you want the opposite join:

p[, EFDT.copy := EFDT]

p[m, roll = Inf]
#   PolicyNo       EFDT  EFDT.copy
#1:  1709119 2011-03-01       <NA>
#2:  1709119 2012-03-01 2012-01-01
#3:  1709119 2013-02-01 2013-02-01
#4:  1709119 2014-02-01 2013-02-01

Add some filtering out of NA's and renaming of columns and you'll be all set.