I have a data.table, let's say test.dt, with a column called "id". The rows of column "id" refer to the column headers of a second data.table, say counts.dt. I would like to provide the length of each column in counts to match up with the corresponding row id in the original test.dt. For example:
test <- function() {
library(data.table)
test.dt <- data.table(id=c("a","b","c"),other=1:3)
counts.dt <- data.table(a=c(1,NA,NA,NA),b=c(1,1,NA,NA),c=c(1,1,1,1),d=1:4,e=1:4)
print(counts.dt)
test.dt<-test.dt[,count:=sum(!is.na(counts.dt[,id]))]
print(test.dt)
}
Which returns: counts.dt as expected:
a b c d e
1: 1 1 1 1 1
2: NA 1 1 2 2
3: NA NA 1 3 3
4: NA NA 1 4 4
However, test.dt seems to count not the number of non-NA elements in the columns of counts.dt, but the length of test.dt resulting in:
id other count
1: a 1 3
2: b 2 3
3: c 3 3
What I would like, is a table like:
id other count
1: a 1 1
2: b 2 2
3: c 3 4
Thoughts?
I have tried making my "sum" statement more complex with different eval functions with the same results. I haven't been able to find an answer to this particular question; any help or redirection to a similar question would be greatly appreciated.
UPDATE: My actual data had more longer text strings as IDs, using the answer as shown resulted in the following error:
Error in Math.factor(j) : abs not meaningful for factors
However, I was able to get things moving with the following:
get.length<-function(x){return(as.character(x))}
test.dt<-test.dt[,count:= sum(!is.na(counts.dt[,get.length(id),with=FALSE]),na.rm=TRUE),by=id]