I'm faced with (what I think) is a tough problem with aggregations on data.table
I've the following data.table
structure(list(id1 = c("a", "a", "a", "b", "b", "c", "c"), id2 = c("x",
"y", "z", "x", "u", "y", "z"), val = c(2, 1, 2, 1, 3, 4, 3)), .Names = c("id1",
"id2", "val"), row.names = c(NA, -7L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x1f66a78>)
I would like to create conditional aggregates on the val column for this data based on the second column id2. The way the aggregation is done is to only include id1 groups which have at least one element from a given id2 element. I'll step through an example to show what I mean.
The conditional aggregate for x (the first row 2nd column) would include val values 2,1,2 for id1 = a and val values = 1,3 from id1 = b because id2=x exists for them but no values from id1=c, resulting in a value of 2 + 1 + 2 + 1 + 3 = 9. I want the 9 as a 4th column in every row where id2 = x appears.
Likewise, I want to do this for all id2 values. So the final output would be
id1 id2 val c.sum
1: a x 2 9
2: a y 1 12
3: a z 2 12
4: b x 1 9
5: b u 3 4
6: c y 4 12
7: c z 3 14
Is this possible in R, data.table? Or any other package/method? Thanks in advance