1
votes

We have an algorithm problem in hand, can you please write your ideas about this, thank you!

There are N many nodes with K different colors. Some of the nodes have direct connection between each other and some do not. We want to select M nodes from these N total nodes, but these M nodes must be connected. Also, our selected group of M nodes must have minimum number of distinct colored neighbors. There might be more than one best combinations, finding any of them is the goal.

For example, we selected M nodes and in total, these M nodes have the following neighbors: 5 red, 3 blue, 1 green. In this case, we count the unique colors, so the number of distinct colored neighbors, in this case, is 3. We want to minimize this number by selecting the best possible combination of M nodes.

Example graph visualization :

graph

In this example, let's assume M = 4, then the best possible combination of nodes would be {9, 10, 11, 12} since this group has only one neighbor which is yellow. If we choose {0, 1, 3, 5}, the neighbors of this combination would be {2, 4, 6}, which consists of 2 red neighbors and 1 green neighbor; which results with score of 2 since we look for distinct number of colored neighbors.

Is this algorithm question NP-complete? How should we proceed? If this is not NP-complete, what is the best algorithm we can use to solve this problem? Can we combine graph algorithms such as Prim’s, Kruskal's, Floyd Warshall or traversal algorithms?

2
Since you do not have code, I'd say this question is a better fit on the math SE, the operations research SE, or the theoretical computer science SE. You might try your luck there!N. Wouda
If we can set K = N and M = N/log N, then it's NP-hard by reduction from the small-set expansion problem. It's obviously fixed-parameter tractable in M, and I think it might be fixed-parameter tractable in K (with a running time like O(2^K poly(N)). Depending on which regime we're in, I'd recommend either dynamic programming or an SMT solver.David Eisenstat

2 Answers

0
votes

If this is an NP problem, you could use ASP to solve it. Given instance.lp

color(0,yellow).
color(3,yellow).
color(8,yellow).
color(10,yellow).
color(2,green).
color(5,green).
color(12,green).
color(7,blue).
color(1,red).
color(4,red).
color(6,red).
color(9,red).
color(11,red).

edge(0,5).
edge(0,1).
edge(0,2).
edge(0,6).
edge(5,3).
edge(5,4).
edge(6,4).
edge(3,4).
edge(2,7).
edge(7,8).
edge(8,9).
edge(5,3).
edge(9,10).
edge(9,11).
edge(9,12).
edge(11,12).

edge(B,A) :- edge(A,B).

and encoding.lp

node(X) :- color(X,_).

%select exactly one start node
1 {start(X) : node(X)} 1.

% start node is in sub graph
sub(X) :- start(X).
% for any node in the sub graph you can add any connected node
{sub(Y) : edge(X,Y)} :- sub(X).

% it is wrong if we do not have exactly m nodes in the sub graph
:- not m = #sum {1,X: sub(X)}.

#minimize {1,C : sub(X), edge(X,Y), not sub(Y), color(Y,C)}.

#show sub/1.

The call clingo encoding.lp instance.lp --const m=4 gives you an optimal solution:

sub(3) sub(5) sub(4) sub(6)

The call clingo encoding.lp instance.lp --const m=4 --opt-mode=optN --project gives you all optimal solutions. The tools can be found at https://potassco.org/

0
votes

The subgraph { 4 3 5 6 } is also a good answer having one red neighbor.

I found this subgraph running a little program called subs.

C:\Users\James\code\subs>subs
l 0 1 1
l 0 6 1
l 0 5 1
l 0 2 1
l 2 7 1
l 3 5 1
l 3 4 1
l 4 6 1
l 4 5 1
l 7 8 1
l 8 9 1
l 9 10 1
l 9 11 1
l 9 12 1
l 11 12 1
n 0 yellow
n 1 red
n 2 green
n 3 yellow
n 4 red
n 5 green
n 6 red
n 7 blue
n 8 yellow
n 9 red
n 10 yellow
n 11 red
n 12 green
<-cPathFinder::read
n1 n0 n6 n5  colors 3
n1 n0 n6 n2  colors 3
n1 n0 n6 n4  colors 2
n1 n0 n5 n2  colors 3
n1 n0 n5 n3  colors 2
n1 n0 n5 n4  colors 3
n1 n0 n2 n7  colors 3
n0 n6 n5 n2  colors 3
n0 n6 n5 n3  colors 2
n0 n6 n5 n4  colors 3
n0 n6 n2 n7  colors 3
n0 n6 n2 n4  colors 4
n0 n6 n3 n4  colors 2
n0 n5 n2 n7  colors 2
n0 n5 n2 n3  colors 2
n0 n5 n2 n4  colors 3
n0 n5 n3 n4  colors 2
n0 n2 n7 n8  colors 2
n6 n5 n3 n4  colors 1
n2 n7 n8 n9  colors 3
n7 n8 n9 n10  colors 2
n7 n8 n9 n11  colors 2
n7 n8 n9 n12  colors 3
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n0 n6 n5 n2  colors 3
n0 n6 n5 n3  colors 2
n0 n6 n5 n4  colors 3
n0 n6 n2 n7  colors 3
n0 n6 n2 n4  colors 4
n0 n6 n3 n4  colors 2
n0 n5 n2 n7  colors 2
n0 n5 n2 n3  colors 2
n0 n5 n2 n4  colors 3
n0 n5 n3 n4  colors 2
n0 n2 n7 n8  colors 2
n6 n5 n3 n4  colors 1
n2 n7 n8 n9  colors 3
n7 n8 n9 n10  colors 2
n7 n8 n9 n11  colors 2
n7 n8 n9 n12  colors 3
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n6 n5 n3 n4  colors 1
n2 n7 n8 n9  colors 3
n7 n8 n9 n10  colors 2
n7 n8 n9 n11  colors 2
n7 n8 n9 n12  colors 3
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n2 n7 n8 n9  colors 3
n7 n8 n9 n10  colors 2
n7 n8 n9 n11  colors 2
n7 n8 n9 n12  colors 3
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n2 n7 n8 n9  colors 3
n7 n8 n9 n10  colors 2
n7 n8 n9 n11  colors 2
n7 n8 n9 n12  colors 3
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n7 n8 n9 n10  colors 2
n7 n8 n9 n11  colors 2
n7 n8 n9 n12  colors 3
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n8 n9 n10 n11  colors 2
n8 n9 n10 n12  colors 2
n8 n9 n11 n12  colors 2
n9 n10 n11 n12  colors 1
n9 n10 n11 n12  colors 1

subgraph n6, n5, n3, n4,   has 1 differently colored neighbours

There is nothing clever about subs. It generates subgraphs and searches through them for the connected subgraph that has the fewest uniquely coloured neighbours. There are lots of nested loops, so it is not going to do well when there are thousands of nodes.

The code is at https://gist.github.com/JamesBremner/5a3965cc725c54dcaabd4beb44f9104f

An optimization has occured to me that should go a long way to handling large graphs. Assuming that the complete graph is connected ( every node can be reached from every other node ) then every subgraph must have at least one neighbour ( otherwise the nodes in the subgraph could not reach the nodes outside the subgraph ). So the best possible number of uniquely colored neighbours is 1 and no less. So the algorithm can stop as soon as the first subgraph with one uniquely colored neighbor has been found, since no better result can exist.