python - Algorithm to find which number in a list sum up to a certain number

Question

I have a list of numbers. I also have a certain sum. The sum is made from a few numbers from my list (I may/may not know how many numbers it's made from). Is there a fast algorithm to get a list of possible numbers? Written in Python would be great, but pseudo-code's good too. (I can't yet read anything other than Python :P )

Example

list = [1,2,3,10]
sum = 12
result = [2,10]

NOTE: I do know of Algorithm to find which numbers from a list of size n sum to another number (but I cannot read C# and I'm unable to check if it works for my needs. I'm on Linux and I tried using Mono but I get errors and I can't figure out how to work C# :(
AND I do know of algorithm to sum up a list of numbers for all combinations (but it seems to be fairly inefficient. I don't need all combinations.)

Googling for "subset sum" will probably give some useful results. — Jerry Coffin
As a side note, if you know Python well it shouldn't be that hard to read languages like C# and at least figure out the big picture of what the code is doing. — Sasha Chedygov
Concerning > I don't need all combinations: Since this problem is known to be NP-complete, at last you probably will have to enumerate all possibilities. — phimuemue
@musicfreak: I'm still in the learning stages. I attempted to rewrite it in Python but it didn't seem to work with a set of 4 numbers and 1 sum; so I'm assuming I didn't write it right. — avacariu

jbernadas jbernadas · Accepted Answer · 2010-08-06T05:16:57

This problem reduces to the 0-1 Knapsack Problem, where you are trying to find a set with an exact sum. The solution depends on the constraints, in the general case this problem is NP-Complete.

However, if the maximum search sum (let's call it S) is not too high, then you can solve the problem using dynamic programming. I will explain it using a recursive function and memoization, which is easier to understand than a bottom-up approach.

Let's code a function f(v, i, S), such that it returns the number of subsets in v[i:] that sums exactly to S. To solve it recursively, first we have to analyze the base (i.e.: v[i:] is empty):

S == 0: The only subset of [] has sum 0, so it is a valid subset. Because of this, the function should return 1.
S != 0: As the only subset of [] has sum 0, there is not a valid subset. Because of this, the function should return 0.

Then, let's analyze the recursive case (i.e.: v[i:] is not empty). There are two choices: include the number v[i] in the current subset, or not include it. If we include v[i], then we are looking subsets that have sum S - v[i], otherwise, we are still looking for subsets with sum S. The function f might be implemented in the following way:

def f(v, i, S):
  if i >= len(v): return 1 if S == 0 else 0
  count = f(v, i + 1, S)
  count += f(v, i + 1, S - v[i])
  return count

v = [1, 2, 3, 10]
sum = 12
print(f(v, 0, sum))

By checking f(v, 0, S) > 0, you can know if there is a solution to your problem. However, this code is too slow, each recursive call spawns two new calls, which leads to an O(2^n) algorithm. Now, we can apply memoization to make it run in time O(n*S), which is faster if S is not too big:

def f(v, i, S, memo):
  if i >= len(v): return 1 if S == 0 else 0
  if (i, S) not in memo:  # <-- Check if value has not been calculated.
    count = f(v, i + 1, S, memo)
    count += f(v, i + 1, S - v[i], memo)
    memo[(i, S)] = count  # <-- Memoize calculated result.
  return memo[(i, S)]     # <-- Return memoized value.

v = [1, 2, 3, 10]
sum = 12
memo = dict()
print(f(v, 0, sum, memo))

Now, it is possible to code a function g that returns one subset that sums S. To do this, it is enough to add elements only if there is at least one solution including them:

def f(v, i, S, memo):
  # ... same as before ...

def g(v, S, memo):
  subset = []
  for i, x in enumerate(v):
    # Check if there is still a solution if we include v[i]
    if f(v, i + 1, S - x, memo) > 0:
      subset.append(x)
      S -= x
  return subset

v = [1, 2, 3, 10]
sum = 12
memo = dict()
if f(v, 0, sum, memo) == 0: print("There are no valid subsets.")
else: print(g(v, sum, memo))

Disclaimer: This solution says there are two subsets of [10, 10] that sums 10. This is because it assumes that the first ten is different to the second ten. The algorithm can be fixed to assume that both tens are equal (and thus answer one), but that is a bit more complicated.

python - Algorithm to find which number in a list sum up to a certain number

5 Answers