3
votes

When given groups of numbers, how can I find the minimal set of groups to cover all the numbers? The constraint is that the chosen groups should not overlap.

For example, given three groups of numbers (1,2), (2,3), and (3,4), we can select (1,2) and (3,4) as (2,3) is redundant.

With (1,2),(2,3),(3,4),(1,4), we have two solutions (1,2),(3,4) or (1,4),(2,3).

With (1,2,3),(1,2,4), and (3,4), there is a redundancy, but there is no solution.

The algorithm that I came up with (for G = (1,2),(2,3),(3,4),(1,4) example) is

collect all the numbers from the groups x = (1,2,3,4)
for g in G:
       x = remove g in x # x = (3,4)
       find G' = (a set of (g' in (G - g))) that makes (G' + g = x) # G' = ((3,4))
       if find (G' + g) return (G',g) # return ((1,2)(3,4))

I know my algorithm has a lot of holes in it in terms of performance, and I assume this might be a well known problem. Any hints to this problem?

1
This is a classical NP-complete problem: en.wikipedia.org/wiki/Set_cover_problem - zch
This is actually a very hard, NP-complete, problem. In the worst case, you won't be able to do much better than the exponential-time brute force algorithm that tests all the possible combinations between the groups. - hugomg
Note: "With (1,2,3),(1,2,4), and (3,4), there is no redundancy." - (1,2,3) and (3,4) cover it.... - Karoly Horvath
@Karoly Horvath: The constraint is that the chosen groups (1,2,3) and (3,4) should not overlap. I updated the post to make it clear. - prosseek
With (1,2,3), (1,2,4), (3,4), there is redundancy, but there is no solution. - user2357112 supports Monica

1 Answers

0
votes

I found a working code in python from this site : http://www.cs.mcgill.ca/~aassaf9/python/algorithm_x.html

X = {1, 2, 3, 4, 5, 6, 7}
Y = {
    'A': [1, 4, 7],
    'B': [1, 4],
    'C': [4, 5, 7],
    'D': [3, 5, 6],
    'E': [2, 3, 6, 7],
    'F': [2, 7]}

def solve(X, Y, solution=[]):
    if not X:
        yield list(solution)
    else:
        c = min(X, key=lambda c: len(X[c]))
        for r in list(X[c]):
            solution.append(r)
            cols = select(X, Y, r)
            for s in solve(X, Y, solution):
                yield s
            deselect(X, Y, r, cols)
            solution.pop()

def select(X, Y, r):
    cols = []
    for j in Y[r]:
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].remove(i)
        cols.append(X.pop(j))
    return cols

def deselect(X, Y, r, cols):
    for j in reversed(Y[r]):
        X[j] = cols.pop()
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].add(i)

X = {j: set(filter(lambda i: j in Y[i], Y)) for j in X}                    
a = solve(X, Y)
for i in a: print i