Finding the smallest sub-array that adds up to a given sum (Duplicates are allowed)

votes

I want to find the smallest sub-array that adds up to a given target.
My inputs are the an input array and the target sum.

I know this problem has been asked many times, but in most cases, people are trying to find every possible combination that adds up to their target or their solution does not allow duplicates.

In my case, I want to find only the smallest sub-array and duplicating data from the input array is allowed.

For example, given an input array of [1,4,10,20,35] and a target of 17, I'd expect an output array of [10,4,1,1,1].
So my algorithm is allowed to duplicate any of the values from the input array in finding the output.

Here's what I have so far:

public static ArrayList<Integer> representNumberAsSumOfArrayElements(ArrayList<Integer> inputs, int target)
{
    ArrayList<Integer> result = new ArrayList<>();

    //First sort the input array in descending order
    Collections.sort(inputs, Collections.reverseOrder());
    int i = 0;

    //Iterate through the input array and break down the target into a sum of the input array
    while (i < inputs.size() && target > 0) {
        if(target >= inputs.get(i) ) {
            result.add(inputs.get(i));
            target = target - inputs.get(i);
        } else {
            i++;
        }
    }
    return result;
}

and just a simple driver to test the code on 100 targets:

public static void main(String[] args) {
    ArrayList<Integer> inputs = new ArrayList<>(Arrays.asList( 1, 4, 10, 20, 35, 56, 84));
    int n = 100;
    ArrayList<Integer> sumArray = new ArrayList<>();
    for (int i = 0; i <= n; i++)
    {
        sumArray = representNumberAsSumOfArrayElements(inputs, i); // O(n)
        System.out.println(i + " written as a sum of array elements " + sumArray);
    }
}

I've implemented an O(n) algorithm that works for most values, but in some cases, I'm getting the wrong result.
For example, when I run my code with an input of [1,4,10,20,35,56,84], and a target sum of 69, the correct output would be [35,20,10,4] but my algorithm outputs [56,10,1,1,1].
I understand why my algorithm is wrong but I'm not sure how to fix it.

javaarraysalgorithm

Duplicates allowed portion makes it a little complicated though. I have done this in JS. I guess it's a very efficient algorithm and easily be converted into Java. The only thing is that it gives you all possibilities. You only left is to filter the shortest one. – Redu

geeksforgeeks.org/find-subarray-with-given-sum <-- for an explination and code on how to do the problem – beastlyCoder

@Aaron I don't think that algorithm allows multiple use of an item. In fact it doesn't even allow you to use non consecutive items. – Redu

I have just tested my code mentioned in my above comment on your given set [1,4,10,20,35,56,84] and target being 69 . Just filtered the result set with result.reduce((p,c) => p.length <= c.length ? p : c); and it returned [4, 10, 20, 35] in 32 ms. Sorry again that i don't have any Java knowledge to present it's only JavaScript. – Redu

@redu This is helpful. I'm conversant in JS so I'll try and rewrite it in Java. – Kaylo

4 Answers

votes

First make an array of prefix sum such that prefSum[i] gives sum of all elements of the given array from index 0 to i(both inclusive) .If your array contains all positive integer then prefSum array is sorted and you can do binary search. So scan the the prefSum array from 0 to length and do binary search in between 0 to (i-1) if your current index is i and try to find largest j where j in between 0 to i-1 such that prefSum[i]-prefSum[j]=given target. Overall complexity will be nlogn.

votes

Term subarray usually assumes contiguous piece of array (that is why some anwerers mean another problem), but your sorting tells us it is not the case, and you need the smallest list of items in arbitrary order to solve subset sum problem with duplicates.

You can solve current problem with dynamic programming using table approach (while your codes exploits greedy method - is not applicable in general case). To get the smallest subset, you just need to choose subproblem solution with shorter list of values.

Seems this Python code works right (is not well tested).

def best(lst, summ):
    lst = sorted(lst, reverse = True)
    a = [[] for _ in range(summ + 1)]  # list of lists
    a[0].append(-1)             # fake value to provide  valid zero entry
    for l in lst:
        for i in range(summ + 1 - l):
            t = len(a[i])
            if t:
                if (len(a[i + l]) == 0) or (t < len(a[i + l])):
                    a[i + l] = a[i] +[l]   # concatenate lists
    return a[summ][1:]   #remove fake -1

 for s in range(55, 71):
     print(s, best([1,4,10,20,35,56,84], s))

55 [35, 20]
56 [56]
57 [56, 1]
58 [56, 1, 1]
59 [35, 20, 4]
60 [56, 4]
61 [56, 4, 1]
62 [56, 4, 1, 1]
63 [35, 20, 4, 4]
64 [56, 4, 4]
65 [35, 20, 10]
66 [56, 10]
67 [56, 10, 1]
68 [56, 10, 1, 1]
69 [35, 20, 10, 4]
70 [35, 35]

Note that we don't need to store lists themselves - I added them for debugging and simplicity. We need to store only the last added value and count of items in given sum.
Solution with unwinding list:

def best1(lst, summ):
    a = [(0,0)] * (summ + 1)   # list contains tuples (value, bestcount)
    a[0] = (-1,1)
    for l in lst:
        for i in range(summ + 1 - l):
            t = a[i][1]
            if t:
                if (a[i + l][1] == 0) or (t < a[i + l][1]):
                    a[i + l] = (l, t + 1)
    res = []
    t = summ
    while a[t][1] > 1:
        res.append(a[t][0])
        t = t - a[t][0]

    return res

votes

Complexity for breadth-first (BFS) approaches is O(n*k), where n is the number of unique elements in the array and k is the length of the shortest answer. Pseudocode follows:

1. remove duplicates from the input array, A:
   can be done by copying it into a set in O(|A|)
2. build a queue of lists Q;
   store the sum of elements as its 0th element, 
   and add an initially-empty list with a sum of 0
3. while Q is not empty, 
     extract the first list of Q, L
     for each element e in A,
       if L[0] + e == sum, 
           you have found your answer: the elements of L with e
       if L[0] + e < sum, 
           insert a new list (L[0] + e, elements of L, e) at the end of Q
4. if you reach this point, there is no way to add up to the sum with elements of A

Not using the 0st element of lists as the sum would incur a cost of recalculating the sum of its elements. In this sense, storing the sum is a form of dynamic programming (= reuse of prior answers to avoid recomputing them).

This guarantees that the first list to add up to sum is also of the shortest possible length (because all lists in the queue are evaluated in ascending order of length). You can improve the run-time by adding a heuristic to choose which of the same-length lists to evaluate first (for example, whichever is closest to the sum); however, this would only work for specific inputs, and the worst-case complexity would remain the same.

votes

Since you are using Java, I would add an implementation in Java, using dynamic programming (it's recursion, in this case).

Code

SubSumDupComb.java:

import java.util.*;

public class SubSumDupComb {
    /**
     * Find shortest combination add to given sum.
     *
     * @param arr input array,
     * @param sum target sum,
     * @return
     */
    public static int[] find(int[] arr, int sum) {
        // System.out.printf("input arr: %s, sum: %d\n", Arrays.toString(arr), sum);
        List<Integer> list = find(arr, 0, sum, new ArrayList<>());
        // System.out.printf("result: %s\n", list);

        return listToArray(list);
    }

    /**
     * Find shortest combination add to given sum, start from given index.
     *
     * @param arr        input array,
     * @param start      start index, for further search,
     * @param sum        remain sum,
     * @param prefixList prefix list,
     * @return
     */
    private static List<Integer> find(int[] arr, int start, int sum, List<Integer> prefixList) {
        if (sum == 0) return prefixList; // base case,
        if (start >= arr.length || sum < 0) return null; // bad case,

        // exclude current index,
        List<Integer> shortestExcludeList = find(arr, start + 1, sum, prefixList);

        // include current index,
        List<Integer> includePrefixList = new ArrayList<>(prefixList);
        includePrefixList.add(arr[start]);
        List<Integer> shortestIncludeList = find(arr, start, sum - arr[start], includePrefixList);

        if (shortestIncludeList == null && shortestExcludeList == null) return null; // both null,
        if (shortestIncludeList != null && shortestExcludeList != null) // both non-null,
            return shortestIncludeList.size() < shortestExcludeList.size() ? shortestIncludeList : shortestExcludeList; // prefer to include elements with larger index,
        else return shortestIncludeList == null ? shortestExcludeList : shortestIncludeList; // exactly one null,
    }

    /**
     * Find shortest combination add to given sum, with cache.
     *
     * @param arr input array,
     * @param sum target sum,
     * @return
     */
    public static int[] findWithCache(int[] arr, int sum) {
        // System.out.printf("input arr: %s, sum: %d\n", Arrays.toString(arr), sum);
        List<Integer> list = findWithCache(arr, 0, sum, new ArrayList<>(), new HashMap<>());
        // System.out.printf("result: %s\n", list);

        return listToArray(list);
    }

    /**
     * Find shortest combination add to given sum, start from given index, with cache.
     *
     * @param arr        input array,
     * @param start      start index, for further search,
     * @param sum        remain sum,
     * @param prefixList prefix list,
     * @return
     */
    private static List<Integer> findWithCache(int[] arr, int start, int sum, List<Integer> prefixList, Map<Integer, Map<Integer, List<Integer>>> cache) {
        if (sum == 0) return prefixList; // base case,
        if (start >= arr.length || sum < 0) return null; // bad case,

        // check cache,
        Map<Integer, List<Integer>> cacheAtStart;
        if ((cacheAtStart = cache.get(start)) != null && cacheAtStart.containsKey(sum)) { // cache hit, tips: the cashed list could be null, which indicate no result,
            // System.out.printf("hit cache: start = %d, sum = %d, cached list: %s\n", start, sum, cacheAtStart.get(sum));
            return cacheAtStart.get(sum);
        }

        // exclude current index, tips: should call this first,
        List<Integer> shortestExcludeList = findWithCache(arr, start + 1, sum, prefixList, cache);

        // include current index,
        List<Integer> includePrefixList = new ArrayList<>(prefixList);
        includePrefixList.add(arr[start]);
        List<Integer> shortestIncludeList = findWithCache(arr, start, sum - arr[start], includePrefixList, cache);

        List<Integer> resultList;

        if (shortestIncludeList == null && shortestExcludeList == null) resultList = null; // both null,
        else if (shortestIncludeList != null && shortestExcludeList != null) // both non-null,
            resultList = shortestIncludeList.size() < shortestExcludeList.size() ? shortestIncludeList : shortestExcludeList; // prefer to include elements with larger index,
        else
            resultList = (shortestIncludeList == null ? shortestExcludeList : shortestIncludeList); // exactly one null,

        // add to cache,
        if (cacheAtStart == null) { // init cache at given start,
            cacheAtStart = new HashMap<>();
            cache.put(start, cacheAtStart);
        }
        cacheAtStart.put(sum, resultList == null ? null : resultList); // add this result to cache,
        // System.out.printf("add cache: start = %d, sum = %d, list: %s\n", start, sum, resultList);

        return resultList;
    }

    /**
     * List to array.
     *
     * @param list
     * @return
     */
    private static int[] listToArray(List<Integer> list) {
        if (list == null) return null; // no solution,

        // list to array,
        int[] result = new int[list.size()];
        for (int i = 0; i < list.size(); i++) {
            result[i] = list.get(i);
        }

        return result;
    }
}

SubSumDupCombTest.java:
(Test case, via TestNG)

import org.testng.Assert;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

import java.util.Arrays;

public class SubSumDupCombTest {
    private int[] arr;
    private int sum;
    private int[] expectedResultArr;

    private int[] arr2;
    private int sum2;
    private int sum2NoSolution;
    private int[] expectedResultArr2;

    @BeforeClass
    public void setUp() {
        // init - arr,
        arr = new int[]{1, 4, 10, 20, 35};
        sum = 17;
        expectedResultArr = new int[]{1, 4, 4, 4, 4};
        Arrays.sort(expectedResultArr);

        // init - arr2,
        arr2 = new int[]{14, 6, 10};
        sum2 = 40;
        sum2NoSolution = 17;
        expectedResultArr2 = new int[]{10, 10, 10, 10};
        Arrays.sort(expectedResultArr2);
    }

    @Test
    public void test_find() {
        // arr
        int[] resultArr = SubSumDupComb.find(arr, sum);
        Arrays.sort(resultArr);
        Assert.assertTrue(Arrays.equals(resultArr, expectedResultArr));

        // arr2
        int[] resultArr2 = SubSumDupComb.find(arr2, sum2);
        Arrays.sort(resultArr2);
        Assert.assertTrue(Arrays.equals(resultArr2, expectedResultArr2));
    }

    @Test
    public void test_find_noSolution() {
        Assert.assertNull(SubSumDupComb.find(arr2, sum2NoSolution));
    }

    @Test
    public void test_findWithCache() {
        // arr
        int[] resultArr = SubSumDupComb.findWithCache(arr, sum);
        Arrays.sort(resultArr);
        Assert.assertTrue(Arrays.equals(resultArr, expectedResultArr));

        // arr2
        int[] resultArr2 = SubSumDupComb.findWithCache(arr2, sum2);
        Arrays.sort(resultArr2);
        Assert.assertTrue(Arrays.equals(resultArr2, expectedResultArr2));
    }

    @Test
    public void test_findWithCache_noSolution() {
        Assert.assertNull(SubSumDupComb.findWithCache(arr2, sum2NoSolution));
    }
}

Explanation:

find()
It's pure recursion.
Compleixty:
- Time:: O(t^n) // in worst case,
- Space: O(n) // used by recursive method stack,
Where:
- t, is average time of an element is included.
findWithCache()
Use cache for each pair of (start, sum).
Compleixty:
- Time:: O(n * s)
- Space: O(n * s) // used by cache, and recursive method stack,
Where:
- s, is count of possible sum in the middle.

Tips:

The result prefer numbers with larger indices, when there are multiple possible shortest results.