I want to extract data.table columns if their contents fulfill a criterion, and I need a method that works with by (or in some other way within combinations of columns). I am not very experienced with data.table and have tried my best with .SDcols and whatever else I could think of.
Example: I often have datasets with observations at multiple time points for multiple subjects. They also contain covariates which do not vary within subjects.
library(data.table)

dt1 <- data.table(
  id   = c(1, 1, 2, 2, 3, 3),
  time = c(1, 2, 1, 2, 1, 2),
  meas = c(452, 23, 555, 33, 322, 32),
  age  = c(30, 30, 54, 54, 20, 20),
  bw   = c(75, 75, 81, 81, 69, 70)
)
How do I (efficiently) select the columns that do not vary within id (in this case, id and age)? I'd like a function call that would return
   id age
1:  1  30
2:  2  54
3:  3  20
And how do I select the columns that do vary within id (so drop age, but keep id)? The function call should return:
   id time meas bw
1:  1    1  452 75
2:  1    2   23 75
3:  2    1  555 81
4:  2    2   33 81
5:  3    1  322 69
6:  3    2   32 70
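To make the two requests concrete, here is a minimal sketch of one way this might be done, assuming that counting distinct values per group with `uniqueN()` is an acceptable test for "varies within id". The helper name `varying_cols` is made up for illustration and is not an existing data.table function, and this is not necessarily the most efficient or idiomatic route.

```r
library(data.table)

dt1 <- data.table(
  id   = c(1, 1, 2, 2, 3, 3),
  time = c(1, 2, 1, 2, 1, 2),
  meas = c(452, 23, 555, 33, 322, 32),
  age  = c(30, 30, 54, 54, 20, 20),
  bw   = c(75, 75, 81, 81, 69, 70)
)

# Hypothetical helper: names of the non-grouping columns that take more
# than one distinct value within at least one group defined by `by`.
varying_cols <- function(dt, by) {
  # One row per group, holding the number of distinct values per column
  counts <- dt[, lapply(.SD, uniqueN), by = by]
  value_cols <- setdiff(names(counts), by)
  # A column varies if any group contains more than one distinct value
  varies <- vapply(value_cols, function(col) any(counts[[col]] > 1L), logical(1))
  value_cols[varies]
}

vary  <- varying_cols(dt1, by = "id")   # columns that vary within id
const <- setdiff(names(dt1), vary)      # the rest, including id itself

# Columns constant within id, reduced to one row per id
unique(dt1[, ..const])

# id plus the columns that vary within id
dt1[, c("id", vary), with = FALSE]
```

Note that bw counts as varying here because it differs within id 3 only; a single group with more than one distinct value is enough.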
Of course, I am interested if you know of a function that addresses the specific example above, but I am even more curious about how to do this in general. For example: select the columns that contain more than two values > 1000 within any combination of id and time, using by=.(id,time), or any other criterion of that kind.
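As a sketch of what such a general version might look like, assuming the criterion can be written as a function that takes one column's values within one group and returns a single TRUE or FALSE: the helper `cols_where` and the data set `dt2` below are made up for illustration only.

```r
library(data.table)

# Hypothetical helper: names of the non-grouping columns for which
# `pred` returns TRUE in at least one group defined by `by`.
cols_where <- function(dt, by, pred) {
  # One row per group, one logical flag per column
  flags <- dt[, lapply(.SD, function(x) isTRUE(pred(x))), by = by]
  value_cols <- setdiff(names(flags), by)
  keep <- vapply(value_cols, function(col) any(flags[[col]]), logical(1))
  value_cols[keep]
}

# Made-up data with several rows per (id, time) combination
dt2 <- data.table(
  id   = c(1, 1, 1, 1, 2, 2, 2, 2),
  time = c(1, 1, 1, 2, 1, 1, 1, 2),
  a    = c(1500, 2000, 3000, 10, 5, 6, 7, 8),
  b    = c(1500, 2000, 30, 10, 5000, 6000, 7, 8)
)

# Columns with more than two values > 1000 in some (id, time) group
sel <- cols_where(dt2, by = c("id", "time"),
                  pred = function(x) sum(x > 1000) > 2)
sel

# Grouping columns plus the selected columns
dt2[, c("id", "time", sel), with = FALSE]
```

Wrapping the predicate in `isTRUE()` means a group where the test evaluates to NA is treated as not fulfilling the criterion, which is a design choice rather than a requirement.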
Thanks!