Suppose I have a data frame:
df <- data.frame(SID = sample(1:4, 15, replace = TRUE), Var1 = rep(c("A", "B", "C"), each = 5), Var2 = sample(2:4, 15, replace = TRUE))
which comes out to something like this:
   SID Var1 Var2
1    4    A    2
2    3    A    2
3    4    A    3
4    3    A    3
5    1    A    4
6    1    B    2
7    3    B    2
8    4    B    4
9    4    B    4
10   3    B    2
11   2    C    2
12   2    C    2
13   4    C    4
14   2    C    4
15   3    C    3
What I hope to accomplish is to find the count of unique SIDs (see the update below: this should have said the count of unique (SID, Var1) combinations), where rows with the same Var1 as the given row are excluded from the count and the count is grouped on Var2. For the example above, I would like the output to be:
   SID Var1 Var2 Count.Excluding.Var1
1    4    A    2                    3
2    3    A    2                    3
3    4    A    3                    1
4    3    A    3                    1
5    1    A    4                    3
6    1    B    2                    3
7    3    B    2                    3
8    4    B    4                    3
9    4    B    4                    3
10   3    B    2                    3
11   2    C    2                    4
12   2    C    2                    4
13   4    C    4                    2
14   2    C    4                    2
15   3    C    3                    2
For the 1st observation, the count is 3 because there are 3 unique (SID, Var1) combinations among the rows that share its Var2 value (2, in this case) and have Var1 != "A" (the Var1 value of the 1st observation). Specifically, the count includes rows 6, 7 and 11. It does not include row 12, because (SID, Var1) = (2, C) is already accounted for by row 11, or row 10, because (3, B) is already accounted for by row 7. It also does not include row 2, because we do not want Var1 to be "A". All of these rows have the same Var2 value.
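To pin the rule down, here is a brute-force base R check of it. This is only a sketch of the logic described above, not the dplyr solution I am after. The data frame is typed in literally from the printout, because the sample() call was not seeded.

```r
# Example data copied from the printed data frame above
df <- data.frame(
  SID  = c(4, 3, 4, 3, 1, 1, 3, 4, 4, 3, 2, 2, 4, 2, 3),
  Var1 = rep(c("A", "B", "C"), each = 5),
  Var2 = c(2, 2, 3, 3, 4, 2, 2, 4, 4, 2, 2, 2, 4, 4, 3)
)

# For each row: keep the rows with the same Var2 but a different Var1,
# then count the unique (SID, Var1) combinations among them
df$Count.Excluding.Var1 <- sapply(seq_len(nrow(df)), function(i) {
  others <- df[df$Var2 == df$Var2[i] & df$Var1 != df$Var1[i], c("SID", "Var1")]
  nrow(unique(others))
})

print(df)
```

On this data the new column should match the Count.Excluding.Var1 values in the desired output above.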
I'd prefer to use dplyr functions and the %>% operator.
UPDATE
I apologize for the confusion and my incorrect explanation above. I have corrected what I intended to ask for in the parentheses, but I am leaving my original phrasing as well because the majority of answers seem to interpret it that way.
As for the example, I apologize for not setting the seed. There seems to have been some confusion regarding the Count.Excluding.Var1 for rows 11 and 12. With unique (SID, Var1) combinations, the value 4 for rows 11 and 12 should make sense: it counts rows 1, 2, 6, and either 7 or 10 (but not both, since both are (SID, Var1) = (3, B)).