3
votes

I have recently been working on a few algorithms and have gotten stuck on the following problem:

Given an undirected unweighted possibly disconnected graph G(V, E) find k partitions of G such that each partition has the same number of vertices (assume the total number of vertices is a multiple of k) and that there is a minimum number of edges in G that connect vertices in different partitions.

I am not looking for a polynomial time algorithm for this problem (as it is probably NP-Hard anyway), but rather an algorithm that will find a solution for 150 vertices and 1200 edges in under 24 hours. While any good approximation algorithms would be much appreciated as well, I would prefer an exact solution.

I kept the problem as simple as possible, however a general solution for a directed weighted graph would be nice as well.

Thanks for any and all help!

Update: I just did some more research and realized this can be reinterpreted as a modified connectivity problem. Maybe there is a solution along that line of thinking?

2

2 Answers

1
votes

This is indeed an NP-hard problem. If you're handy with convex optimization or can learn, my instinct is to formulate this problem as an exact cover integer program with many variables (one per subset of |V|/k vertices) and then actually solve that program by generating columns with an integer program solver to find partitions with many interior edges. The subproblem formulation will look like

maximize sum_{vertices v} w_v x_v + sum_{edges uv} y_{uv}
subject to
sum_{vertices v} x_v = |V|/k
y_{uv} <= x_u for all edges uv
y_{uv} <= x_v for all edges uv
x_v in {0, 1} for all vertices v
y_{uv} in {0, 1} for all edges uv

where w_v are weights determined by the dual solution to the master problem, intuitively understood as how urgently a particular vertex needs coverage.

0
votes

One approach would be to use clustering algorithms. You are not going to get the optimal solution, but a quick semi-optimal solution.

You could approach this in various ways. For example, you could use biclustering. Or you could do principle component analysis, pick a few components and run a modified k-means algorithm.