this question might have been asked previously but I'm looking for a data.table solution if possible without using other packages. Ive got a data.table DT1 as a reference:
> require(data.table)
> DT1 <- data.table(col1 = c("AA", "BA", "ABC", "ABC BC", "AB")
, col2 = c(1,4,5,3,2))
> DT1
col1 col2
1: AA 1
2: BA 4
3: ABC 5
4: ABC BC 3
5: AB 2
and I would like to merge a second data.table DT2 with DT1 based on partial matching of col1 in DT1 and col2 in DT2, creating a col3 in DT2.
> DT2 <- data.table(col1 = c(0,5,2,7,1,0)
, col2 = c("BA", "ABC", "DC", "AA", "AB", "R AB"))
> DT2
col1 col2
1: 0 BA
2: 5 ABC
3: 2 DC
4: 7 AA
5: 1 AB
6: 0 R AB
desired output
> desired_output <- data.table(col1 = c(0,5,5,2,7,1,1,1,0)
, col2 = c("BA", "ABC", "ABC", "DC", "AA", "AB", "AB", "AB", "R AB")
, col3 = c(4,5,3,NA,1,5,3,2,2))
> desired_output
col1 col2 col3
1: 0 BA 4
2: 5 ABC 5
3: 5 ABC 3
4: 2 DC NA
5: 7 AA 1
6: 1 AB 5
7: 1 AB 3
8: 1 AB 2
9: 0 R AB 2
Is there any graceful way to do this using data.table operations? if not then happy to consider other solutions. this is going to be run on a very large dataset.
Edit: to specify the conditions of partial matching, it's a match if either string of col1 in DT1 is a subset of the string of col2 in DT2 or vice versa (the string of col2 in DT2 is a subset of the string of col1 in DT1). A two way grepl?
col1/DT1 col2/DT2
"AB" "There is ABhere" # it's a match
"ABC" "someABC" # it's a match
"ABC BC" "ABC" # it's a reverse match
"DR" "ADD" # no match
"BA" "HABAHA" # two matches