data.table does not retain its key when taking a subset

setkey(DT,a,b,c)
key(DT[,list(a,b)]) # returns NULL

Does anybody have a workaround for this (where X, Y, and hence Z are data.tables)? What I actually want to do is:

Z = X[,list(a, b, c)][Y, mult='last']

NOTE: I could do

X2 = X[,list(a, b, c)]
setattr(X2,"sorted",c("a","b","c"))
Z = X2[Y, mult='last']

but I do not want to copy X into X2.

EDIT example:

Y = data.table(a=seq(2,4),key="a")
X = data.table(a=seq(1,5),b=seq(2,6),c=sample(letters,5),key="a,b,c")
X[,list(a, b, c)][Y, mult='last']
Error in `[.data.table`(X[, list(a, b, c)], Y, mult = "last") : 
When i is a data.table (or character vector), x must be keyed

UPDATE (eddi): As of version 1.8.11 this has been fixed and the key is retained in the first subset, so that the result is:

X[,list(a, b, c)][Y, mult='last']
#   a b c
#1: 2 3 k
#2: 3 4 z
#3: 4 5 u
Please provide an example to work with. - Arun
Well, I find it easier to work with the data, and I don't want to create an example every time someone comes up with a question (however simple it might be). I'll leave it to someone else to answer. - Arun
@eddi: Nice! I had been doing this operation the with=FALSE way you mentioned below. Your update is a little cryptic insomuch as it requires reading the whole post; the result of key(DT[,list(a,b)]) would be clearer. Also, is a copy made (another of the OP's concerns)? I'd be surprised if it was possible to avoid making a copy... - Frank
@Frank thanks, good suggestion, edited. I think the same amount of copying happens, namely columns a, b and c get copied. - eddi

1 Answer

Try this:

setkey(X[,list(a, b, c)])[Y, mult='last']
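
For reference, here is a minimal sketch of that one-liner applied to the example from the question. It assumes fixed letters instead of sample(letters, 5) so the result is reproducible, and the intermediate name X2 is used only for illustration. Calling setkey() without column names keys the table by all of its columns, which here is a, b, c.

```r
library(data.table)

# Example data from the question, with a deterministic column c (assumption).
Y <- data.table(a = seq(2, 4), key = "a")
X <- data.table(a = seq(1, 5), b = seq(2, 6), c = letters[1:5], key = "a,b,c")

# Subset the columns, then key the subset by all of its columns.
# X2 is a hypothetical name used here only to inspect the key.
X2 <- setkey(X[, list(a, b, c)])
print(key(X2))

# Join Y against the keyed subset, keeping the last match per row of Y.
Z <- X2[Y, mult = "last"]
print(Z)
```

The column subset still copies a, b and c, but no separately named copy of X has to be kept around when the call is written as a single chained expression.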

Alternatively, you can do X[Y] and then set all the other columns from X to NULL:

X[Y, mult="last"][, c(names_to_remove) := NULL]
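
As a rough sketch of this second approach, assume X carries two extra columns, d and e, that are not wanted in the result. Both those columns and the way names_to_remove is built with setdiff() are illustrative assumptions, not part of the original question.

```r
library(data.table)

Y <- data.table(a = seq(2, 4), key = "a")
# d and e are hypothetical extra columns added for illustration.
X <- data.table(a = seq(1, 5), b = seq(2, 6), c = letters[1:5],
                d = 11:15, e = 21:25, key = "a,b,c")

# Every column of X that should not appear in the result.
names_to_remove <- setdiff(names(X), c("a", "b", "c"))

# Join first (X keeps its key), then drop the unwanted columns by reference
# from the join result. X itself is not modified.
Z <- X[Y, mult = "last"][, c(names_to_remove) := NULL]
print(Z)
```

Here the join result is a new data.table, so removing columns from it by reference leaves X untouched.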