I have the following mock-up table:
#n a b group
1 1 1 1
2 1 2 1
3 2 2 1
4 2 3 1
5 3 4 2
6 3 5 2
7 4 5 2
I am using SAS for this problem. The column group marks rows that are interconnected through a and b. I will try to explain why these rows are in the same group:
- rows 1 and 2 are in group 1 since they both have a = 1
- row 3 is in group 1 since b = 2 in both rows 2 and 3, and row 2 is in group 1
- rows 3 and 4 are in group 1 since a = 2 in both rows and row 3 is in group 1
The overall logic is that if row x contains the same value of a or b as row y, then row x belongs to the same group as row y. Following the same logic, rows 5, 6 and 7 are in group 2.
Is there any way to make an algorithm to find these groups?
Are a and b always increasing for each successive row? If yes, then Richard's answer will work, but if not then this is a much trickier problem that will involve making multiple passes through your data to identify connected components. – user667489
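
For what it's worth, the connected-components idea mentioned in the comment can be sketched with a union-find (disjoint-set) structure. This is Python rather than SAS and is only meant to illustrate the algorithm. The names `find` and `assign_groups` are made up for this example, and it assumes that a and b are separate namespaces (a = 2 and b = 2 are treated as different nodes).

```python
def find(parent, x):
    # Walk up to the root of x's component, shortening the path as we go.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def assign_groups(rows):
    """rows is a list of (a, b) pairs; returns one group number per row."""
    parent = {}
    for a, b in rows:
        # Tag the values so that a = 2 and b = 2 are distinct nodes.
        na, nb = ("a", a), ("b", b)
        parent.setdefault(na, na)
        parent.setdefault(nb, nb)
        ra, rb = find(parent, na), find(parent, nb)
        if ra != rb:
            parent[rb] = ra  # each row links its a value to its b value

    group_of_root = {}
    groups = []
    for a, _ in rows:
        root = find(parent, ("a", a))
        # Number the components in order of first appearance, starting at 1.
        group_of_root.setdefault(root, len(group_of_root) + 1)
        groups.append(group_of_root[root])
    return groups


# The mock-up table from the question, as (a, b) pairs.
rows = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
print(assign_groups(rows))  # [1, 1, 1, 1, 2, 2, 2]
```

This does not depend on a and b increasing from row to row, since all links are merged before any group number is handed out. In SAS the same bookkeeping could probably be done with a hash object in a data step, but that port is not shown here.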