I am trying to select rows by group in a data.frame (or data.table), but I haven't found the proper way to do it. Consider the following example:
library(data.table)
DF <- data.table(value = c(seq(5, 1, -1), c(5, 5, 3, 2, 1)), group = rep(c("A", "B"), each = 5), status = rep(c("D", "A", "A", "A", "A"), 2))
    value group status
 1:     5     A      D
 2:     4     A      A
 3:     3     A      A
 4:     2     A      A
 5:     1     A      A
 6:     5     B      D
 7:     5     B      A
 8:     3     B      A
 9:     2     B      A
10:     1     B      A
I'd now like to get, for each group, the row holding the maximum value among rows whose status is alive ("A"). I have tried this:
DF[,.I[value==max(value[status!="D"])],by=group]
   group V1
1:     A  2
2:     B  6
3:     B  7
But row 6 has status "D" (dead), and I'd like to exclude it. I can't simply subset the data like this:
DF[status!="D",.I[value==max(value[status!="D"])],by=group]
because I need to compute several statistics per group in the same call, for example (this doesn't work):
DF[, list("max" = max(value[status != "D"], na.rm = TRUE), "group" = group[.I[value == max(value[status == "D"], na.rm = TRUE)]]), by = group]
Any hint would be greatly appreciated!
DF[status != "D", .I[value == max(value)], by = group]$V1
# [1] 2 7
Row 6 is not there. – akrun
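If the filter has to stay out of `i` so that other per-group statistics can be computed in the same call, one possible approach is to put the status test inside the row condition itself, so that only alive rows can match the maximum. The following is a minimal sketch under that assumption, not a definitive solution; the names `idx`, `stats`, `alive`, `max_alive`, `n_dead` and `row_of_max` are illustrative and do not come from the original post.

```r
library(data.table)

DF <- data.table(
  value  = c(seq(5, 1, -1), c(5, 5, 3, 2, 1)),
  group  = rep(c("A", "B"), each = 5),
  status = rep(c("D", "A", "A", "A", "A"), 2)
)

# Row numbers of the per-group maximum, considering alive rows only.
# The status test is part of the condition, so a dead row with the
# same value (row 6 here) can no longer match.
idx <- DF[, .I[status != "D" & value == max(value[status != "D"])], by = group]$V1
idx

# Several per-group statistics in one call (column names are illustrative).
stats <- DF[, {
  alive <- status != "D"
  list(
    max_alive  = max(value[alive]),                     # maximum among alive rows
    n_dead     = sum(!alive),                           # number of dead rows
    row_of_max = .I[alive & value == max(value[alive])] # row number(s) in DF
  )
}, by = group]
stats
```

With the example data, `idx` should contain rows 2 and 7. Note that tied maxima among alive rows produce one output row per tie, and a group with no alive rows makes `max()` return `-Inf` with a warning, so real data may need an extra guard.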