2
votes

I am working on a subset sum problem, that needs to print the subset sum that's closest to the value, if equal then just print the value. Only positive integers

If there are multiple subset sums that are equally close to the value,

value = 10, subsetSum1 = 9, subsetSum2 = 11

print the sum thats less than the value.

So console app finds the equal subset sum perfectly, but how would I go about printing out the subset sum thats closest to the value.

 class Program
    {
        static int[] elements;
        static int value;
        static bool solution = false;

        static void Main()
        {
            value = 10000;
            Console.WriteLine("How many numbers ?");
            int elementsQty = int.Parse(Console.ReadLine());
            elements = new int[elementsQty];
            for (int i = 0; i < elementsQty; i++)
            {
                elements[i] = int.Parse(Console.ReadLine());
            }
            Console.WriteLine("\nOutput:");
            List<int> subset = new List<int>();
            GetSubset(0, subset);
            if (!solution)
                Console.WriteLine("No match");

            Console.ReadLine();
        }

        static void GetSubset(int index, List<int> myElements)
        {
            if (myElements.Sum() == value && myElements.Count > 0) 
            {
                Console.WriteLine(" {0} = {1}", string.Join(" + ", myElements), value);
                solution = true; 
            }

            for (int i = index; i < elements.Length; i++)
            {
                myElements.Add(elements[i]);
                GetSubset(i + 1, myElements); 
                myElements.RemoveAt(myElements.Count - 1);
            }
        }
    }
1
contiguous subset? And can the numbers be negative?user4843530
Your algorithm is rather inefficient. Using dynamic programming you can do it faster as well as easy solving your problem.Willem Van Onsem
Are you interested in the elements in the subset, or is the closest sum itself sufficient?Willem Van Onsem
just the closest sumuser33232
Only positive integers @AgapwIesuuser33232

1 Answers

6
votes

Your current solution makes use of backtracking. The problem with this approach is that the time complexity scales exponential. Which is not good: from the moment you enter a reasonable number of elements (like 100+), your program is doomed.

Given your list of integers are all (strict) positive, a better way to do this is using dynamic programming.

The idea is that if you search for a sum K, you define a memory of at most 2 K + 1 elements of lists. Initially all memory elements are invalid null, except the element index 0, where you store an empty collection.

So initially the memory looks like:

7 -> _
6 -> _
5 -> _
4 -> _
3 -> _
2 -> _
1 -> _
0 -> []

if b=8 (we will use b=8 through the remainder of this answer, but it is of course only an example).

with _ nothing (a null pointer), and [] an empty collection (regardless of what kind of collection).

Now for each element in your given set of numbers, you perform the following tasks. you iterate over all effective (not null) collections in memory. For each of the collections, you "upgrade" that collection: you make a copy, add the element to the collection, and store it into the index with the new sum. In case there is already a collection with that sum, you don't do anything. Such iteration can be implemented as follows:

for(int s = b-xi-1; s >= 0; s--) {
    if(cols[s+xi] == null && cols[s] != null) {
        List<int> cln = new List<int>(cols[s]);
        cln.Add(xi);
        cols[s+xi] = cln;
    }
}

with xi the element we wish to add. The if statement thus checks if the collection with sum s is effective (not null) and whether we have to create a new collection: whether a collection with the resulting sum does not yet exists. The for loop sets bounds: it is no use to upgrade a collection out of bounds: so both s+xi and s must be valid bounds. This means s has a range from 0 (included) to b-xi-1 (included). We have to iterate backwards because we could otherwise add our element xi a second time.

Indeed, take as example that the first element is 2, now we start iterating (wrongly) from 0 to 8-2-1=5. Now that means that if s=0, we "upgrade" the empty collection, so now the memory looks like:

7 -> _
6 -> _
5 -> _
4 -> _
3 -> _
2 -> [2]
1 -> _
0 -> []

([2] is a collection with 2), the next iteration of the for loop, s=1 and we see no collection with sum one exists, so we continue, but now s=2 and we have just constructed such collection. The point is that our algorithm does not do any bookmarking, and thus would add 2 a second time resulting in:

7 -> _
6 -> _
5 -> _
4 -> [2,2]
3 -> _
2 -> [2]
1 -> _
0 -> []

Now one could do two things: do bookmarking about which collections are constructed that iteration, but that requires additional work, or one can iterate from high to low. Since all integers xi are guaranteed to be positive, we know we cannot "upgrade" a collection in the downwards collection. If we perform an entire iteration in the correct way, afterwards the memory looks like:

7 -> _
6 -> _
5 -> _
4 -> _
3 -> _
2 -> [2]
1 -> _
0 -> []

If the next element is 1, the memory looks like:

7 -> _
6 -> _
5 -> _
4 -> _
3 -> [2,1]
2 -> [2]
1 -> [1]
0 -> []

Now given the next element is 3, we finally see the power of dynamic programming:

7 -> _
6 -> [2,1,3]
5 -> [2,3]
4 -> [1,3]
3 -> [2,1]
2 -> [2]
1 -> [1]
0 -> []

You notice that we haven't constructed a collection for 3 with [3], because there is already such collection. This might look not that much impressive. But all collections that originate from [2,1] will now not have a duplicate with [3] which would have been the case if one uses a backtracking algorithm.

After doing this for every element, the memory for each index either a way to create a subset with a sum that corresponds to the index, or null if no such subset could be constructed. Now you can simply take a look at the constructed collections, and pick the one closest to K. We know such collection differs at most K, because the empty collection has sum zero and thus differs K.

The entire story can be told with this algorithm:

static List<int> GetSubset(int value, IEnumerable<int> xs) {
    int b = 2*value;
    List<int>[] cols = new List<int>[b];
    cols[0] = new List<int>();
    foreach(int xi in xs) {
        for(int s = b-xi-1; s >= 0; s--) {
            if(cols[s+xi] == null && cols[s] != null) {
                List<int> cln = new List<int>(cols[s]);
                cln.Add(xi);
                cols[s+xi] = cln;
            }
        }
    }
    for(int d = 0; d < value; d++) {
        if(cols[value-d] != null) {
            return cols[value-d];
        } else if(cols[value+d] != null) {
            return cols[value+d];
        }
    }
    return cols[0];
}

The List<T> approach is not the most efficient: you can use a head-tail list approach (but unfortunately, the .NET library seems to lack this).

Demo (using csharp interactive shell):

$ csharp
Mono C# Shell, type "help;" for help

Enter statements below.
csharp> public class Foo {
      >  
      > public static List<int> GetSubset(int value, IEnumerable<int> xs) {
      >         int b = 2*value;
      >         List<int>[] cols = new List<int>[b];
      >         cols[0] = new List<int>();
      >         foreach(int xi in xs) {
      >             for(int s = b-xi-1; s >= 0; s--) {
      >                 if(cols[s+xi] == null && cols[s] != null) {
      >                     List<int> cln = new List<int>(cols[s]);
      >                     cln.Add(xi);
      >                     cols[s+xi] = cln;
      >                 }
      >             }
      >         }
      >         for(int d = 0; d < value; d++) {
      >             if(cols[value-d] != null) {
      >                 return cols[value-d];
      >             } else if(cols[value+d] != null) {
      >                 return cols[value+d];
      >             }
      >         }
      >         return cols[0];
      >     }
      > }
csharp>  
csharp> int[] items = new int[] {2,3,5,7};
csharp> Foo.GetSubset(8,items);
{ 3, 5 }
csharp> Foo.GetSubset(7,items); 
{ 2, 5 }
csharp> Foo.GetSubset(9,items); 
{ 2, 7 }
csharp> Foo.GetSubset(6,items); 
{ 2, 3 }
csharp> Foo.GetSubset(10,items); 
{ 2, 3, 5 }
csharp> Foo.GetSubset(11,items); 
{ 2, 3, 5 }

As you can see 6 cannot be formed with these integers, but one can come up with a set that sums up to 5.

An advantage of this approach is that you need to do the search only once: you could evidently call your method multiple times first search for the value K, then K+1, then K-1,etc. but the problem is that this will result in a computationally expensive method.