subset data.table based on key being NOT an element of a list

Question

I have the following data.table:

DT = data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))

DT
   ID  A
1:  1 13
2:  2  1
3:  4 13
4:  5 11
5: 10 12

The contents of column A are not important. I have a vector test <- c(1, 5, 9, 10, 11, 12, ...) that can be many times longer than the data.table. I want to select the rows of DT whose key ID is not present in test:

   ID  A
2:  2  1
3:  4 13

I think that DT[!(ID %in% test)] works, but I would like to take advantage of data.table's fast key-based subsetting. Note that test might have no elements in common with the key of DT, in which case the subset should return DT itself, or it might contain every key, in which case the result should be an empty data.table. Any suggestions?

Waldi · Accepted Answer · 2020-06-06T17:30:04

What about a not-join:

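library(data.table)
DT   <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))
# Wrap the lookup vector in a one-column data.table so it can be used in a join
test <- data.table(ID = c(1, 5, 9, 10, 11, 12))
setkey(test, ID)  # optional here: on= names the join column explicitly
# Not-join (anti-join): keep the rows of DT whose ID has no match in test
DT[!test, on = "ID"]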
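Since the question asks about key-based subsetting and the two edge cases, here is a minimal sketch of the same not-join written against a keyed DT, assuming test is a plain numeric vector as in the question. The names res_key, res_on and res_in are illustrative only, and the edge-case vectors c(100, 200) and DT$ID are made up to exercise the "no overlap" and "full overlap" situations.

```r
library(data.table)

DT   <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))
test <- c(1, 5, 9, 10, 11, 12)   # plain vector, as in the question

setkey(DT, ID)                   # sorts DT by ID and marks ID as its key

# Keyed not-join: J() wraps the vector so it is matched against the key of DT
res_key <- DT[!J(test)]

# Equivalent not-join that names the column explicitly instead of using the key
res_on <- DT[!data.table(ID = test), on = "ID"]

# Baseline from the question, for comparison
res_in <- DT[!(ID %in% test)]

# Edge cases described in the question
res_none <- DT[!J(c(100, 200))]  # no IDs in common: every row of DT is kept
res_all  <- DT[!J(DT$ID)]        # every ID present: zero-row data.table
```

All three forms should select the rows with ID 2 and 4. The J() form relies on the column type of the lookup values matching the key (both numeric here); whether it is actually faster than %in% for a given size of test is worth benchmarking rather than assuming.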
