5
votes

How would you use dynamic programming to find the list of positive integers in an array whose sum is closest to but not equal to some positive integer K?

I'm a little stuck thinking about this.

3
The sum can be greater than or less than K, just not equal?Vaughn Cato
@vaughncato yes, and it has to be as close as possible to KÓscar López
@VaughnCato : Don't you mean "not just" instead of "just not"?Undreren
@Undreren: I suppose I should have said "bot not equal"Vaughn Cato

3 Answers

6
votes

The usual phrasing for this is that you're looking for the value closest to, but not exceeding K. If you mean "less than K", it just means that your value of K is one greater than the usual. If you truly mean just "not equal to K", then you'd basically run through the algorithm twice, once finding the largest sum less than K, then again finding the smallest sum greater than K, then picking the one of those whose absolute difference from K is the smallest.

For the moment I'm going to assume you really mean the largest sum less than or equal to K, since that's the most common formulation, and the other possibilities don't really have much affect on the algorithm.

The basic idea is fairly simple, though it (at least potentially) uses a lot of storage. We build a table with K+1 columns and N+1 rows (where N = number of inputs). We initialize the first row in the table to 0's.

Then we start walking through the table, and building the best value we can for each possible maximum value up to the real maximum, going row by row so we start with only a single input, then two possible inputs, then three, and so on. At each spot in the table, there are only two possibilities for the best value: the previous best value that doesn't use the current input, or else the current input plus the previous best value for the maximum minus the current input (and since we compute the table values in order, we'll always already have that value).

We also usually want to keep track of which items were actually used to produce the result. To do that, we set a Boolean for a given spot in the table to true if and only if we compute a value for that spot in the table using the new input for that row (rather than just copying the previous row's best value). The best result is in the bottom, right-hand corner, so we start there, and walk backward through the table one row at a time. When we get to a row where the Boolean for that column was set to true, we know we found an input that was used. We print out that item, and then subtract that from the total to get the next column to the left where we'll find the next input that was used to produce this output.

Here's an implementation that's technically in C++, but written primarily in a C-like style to make each step as explicit as possible.

#include <iostream>
#include <functional>

#define elements(array) (sizeof(array)/sizeof(array[0]))

int main() {

    // Since we're assuming subscripts from 1..N, I've inserted a dummy value
    // for v[0].
    int v[] = {0, 7, 15, 2, 1};

    // For the moment I'm assuming a maximum <= MAX.
    const int MAX = 17;

    // ... but if you want to specify K as the question implies, where sum<K, 
    // you can get rid of MAX and just specify K directly:
    const int K = MAX + 1;

    const int rows = elements(v);

    int table[rows][K] = {0};
    bool used[rows][K] = {false};

    for (int i=1; i<rows; i++)
        for (int c = 0; c<K; c++) {
            int prev_val = table[i-1][c];
            int new_val;

            // we compute new_val inside the if statement so we won't 
            // accidentally try to use a negative column from the table if v[i]>c
            if (v[i] <= c && (new_val=v[i]+table[i-1][c-v[i]]) > prev_val) {
                table[i][c] = new_val;
                used[i][c] = true;
            }
            else
                table[i][c] = prev_val;
        }

    std::cout << "Result: " << table[rows-1][MAX] << "\n";
    std::cout << "Used items where:\n";
    int column = MAX;
    for (int i=rows; i>-1; i--)
        if (used[i][column]) {
            std::cout << "\tv[" << i << "] = " << v[i] << "\n";
            column -= v[i];
        }

    return 0;
}

There are a couple of things you'd normally optimize in this (that I haven't for the sake of readability). First, if you reach an optimum sum, you can stop searching, so in this case we could actually break out of the loop before considering the final input of 1 at all (since 15 and 2 give the desired result of 17).

Second, in the table itself we only really use two rows at any given time: one current row and one previous row. The rows before that (in the main table) are never used again (i.e., to compute row[n] we need the values from row[n-1], but not row[n-2], row[n-3], ... row[0]. To reduce storage, we can make the main table be only two rows, and we swap between the first and second rows. A very C-like trick to do that would be to use only the least significant bit of the row number, so you'd replace table[i] and table[i-1] with table[i&1] and table[(i-1)&1] respectively (but only for the main table -- not when addressing the used table.

3
votes

Here is an example in python:

def closestSum(a,k):
  s={0:[]}
  for x in a:
    ns=dict(s)
    for j in s:
      ns[j+x]=s[j]+[x]
    s=ns
  if k in s:
    del s[k]
  return s[min(s,key=lambda i:abs(i-k))]

Example:

>>> print closestSum([1,2,5,6],10)
[1, 2, 6]

The idea is simply to keep track of what sums can be made from all previous elements as you go through the array, as well as one way to make that sum. At the end, you just pick the closest to what you want. It is a dynamic programming solution because it breaks the overall problem down into sub-problems, and uses a table to remember the results of the sub-problems instead of recalculating them.

1
votes

Cato's idea in Racket:

#lang racket
(define (closest-sum xs k)
  (define h (make-hash '([0 . ()])))
  (for* ([x xs] [(s ys) (hash-copy h)])
    (hash-set! h (+ x s) (cons x ys))
    (hash-set! h x (list x)))
  (when (hash-ref h k #f) (hash-remove! h k))
  (cdr (argmin (λ (a) (abs (- k (car a)))) (hash->list h))))

To get an even terser program, one can grab terse-hash.rkt from GitHub and write:

(define (closest-sum xs k)
  (define h {make '([0 . ()])})
  (for* ([x xs] [(s ys) {copy h}])
    {! h (+ x s) (cons x ys)}
    {! h x (list x)})
  (when {h k #f} {remove! h k})
  (cdr (argmin (λ (a) (abs (- k (car a)))) {->list h})))