What's the correct way to remove multiple columns from a data.table? I'm currently using the code below, but I got unexpected behavior when I accidentally repeated one of the column names. I'm not sure whether this is a bug or whether I shouldn't be removing columns this way.
library(data.table)
DT <- data.table(x = letters, y = letters, z = letters)
DT[ ,c("x","y") := NULL]
names(DT)
[1] "z"
The above works fine, but repeating a column name removes y as well:
DT <- data.table(x = letters, y = letters, z = letters)
DT[ ,c("x","x") := NULL]
names(DT)
[1] "z"
You could wrap the left-hand side of the := assignment in a call to unique() (i.e. use DT[, unique(c("x", "x")) := NULL]) to be extra defensive. Since this seems like a data.table bug, I'd guess you'll only have to do that until Matthew Dowle moves that call to unique() (or something equivalent to it) inside of [.data.table(). – Josh O'Brien
With the column names in a variable, DT[, c(myCols) := NULL] should do the trick. See rdatatable.gitlab.io/data.table/articles/… – Vince
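For what it's worth, here is a minimal sketch that combines the two suggestions from the comments: deduplicating the names with unique() and removing columns whose names are held in a variable. The name myCols comes from the comment above; colsToDrop is just a hypothetical helper name, and the data mirrors the example in the question.

```r
library(data.table)

# Same example data as in the question
DT <- data.table(x = letters, y = letters, z = letters)

# Column names held in a variable; "x" is repeated on purpose
myCols <- c("x", "x")

# Deduplicate first, so each column is named at most once
colsToDrop <- unique(myCols)

# The parentheses make data.table evaluate colsToDrop as a vector of
# column names, rather than treating it as a literal column name
DT[, (colsToDrop) := NULL]

names(DT)
# [1] "y" "z"
```

Deduplicating up front costs nothing and should keep the call safe regardless of how a given data.table version handles repeated names on the left-hand side of :=.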