6
votes

A space optimization for the 0/1 knapsack dynamic programming algorithm is to use a 1-d array (say, A) of size equal to the knapsack capacity, and simply overwrite A[w] (if required) at each iteration i, where A[w] denotes the total value if the first i items are considered and knapsack capacity is w. If this optimization is used, is there a way to reconstruct the list of items picked, perhaps by saving some extra information at each iteration of the DP algorithm? For example, in the Bellman Ford Algorithm a similar space optimization can be implemented, and the shortest path can still be reconstructed as long as we keep a list of the predecessor pointers, ie the last hop (or first, depending on if a source/destination oriented algorithm is being written).

For reference, here is my C++ function for the 0/1 knapsack problem using dynamic programming where I construct a 2-d vector ans such that ans[i][j] denotes the total value considering the first i items and knapsack capacity j. I reconstruct the items picked by reverse traversing this vector ans:

void knapsack(vector<int> v,vector<int>w,int cap){
 //v[i]=value of item i-1
 //w[i]=weight of item i-1, cap=knapsack capacity
 //ans[i][j]=total value if considering 1st i items and capacity j
 vector <vector<int> > ans(v.size()+1,vector<int>(cap+1));

 //value with 0 items is 0
 ans[0]=vector<int>(cap+1,0);

 //value with 0 capacity is 0
 for (uint i=1;i<v.size()+1;i++){
    ans[i][0]=0;
 }

 //dp
 for (uint i=1;i<v.size()+1;i++) {
    for (int x=1;x<cap+1;x++) {
        if (ans[i-1][x]>=ans[i-1][x-w[i-1]]+v[i-1]||x<w[i-1])
            ans[i][x]=ans[i-1][x];
        else {
            ans[i][x]=ans[i-1][x-w[i-1]]+v[i-1];
        }
    }
 }
 cout<<"Total value: "<<ans[v.size()][cap]<<endl;

 //reconstruction
 cout<<"Items to carry: \n";
 for (uint i=v.size();i>0;i--) {
    for (int x=cap;x>0;x--) {
        if (ans[i][x]==ans[i-1][x]) //item i not in knapsack
            break;
        else if (ans[i][x]==ans[i-1][x-w[i-1]]+v[i-1]) { //item i in knapsack
            cap-=w[i-1];
            cout<<i<<"("<<v[i-1]<<"), ";
            break;
        }
    }
 }
 cout<<endl;
}
3
I'm a bit confused about the question; you describe a one-dimensional state space A, but the implementation uses a two-dimensional state space ans? So, basically the space_optimized implementaion also uses ans, but discards the rows for smaller values of i?Codor
@Codor Right, I'm just showing how we reconstruct the list of items from a 2-d vector. I don't know how we'll do it if it's a 1-d vector. Calculating just the total value with that 1-d vector is straightforward though, we would just have ans[x] instead of ans[i][x] and keep overwriting ans[x]. If item i is not included ans[x] remains unchanged for the iteration otherwise we replace it with ans[i]+v[i-1].kabir
In your DP calculation, test x<w[i-1] before the other half of the || condition, so that short-circuiting will prevent ans[i-1][x-w[i-1]] trying to access an array element at a negative index, and thus possibly crashing.j_random_hacker

3 Answers

3
votes

The following is a C++ implementation of yildizkabaran's answer. It adapts Hirschberg's clever divide & conquer idea to compute the answer to a knapsack instance with n items and capacity c in O(nc) time and just O(c) space:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

// Returns a vector of (cost, elem) pairs.
vector<pair<int, int>> optimal_cost(vector<int> const& v, vector<int> const& w, int cap) {
    vector<pair<int, int>> dp(cap + 1, { 0, -1 });

    for (int i = 0; i < size(v); ++i) {
        for (int j = cap; j >= 0; --j) {
            if (w[i] <= j && dp[j].first < dp[j - w[i]].first + v[i]) {
                dp[j] = { dp[j - w[i]].first + v[i], i };
            }
        }
    }

    return dp;
}

// Returns a vector of item labels corresponding to an optimal solution, in increasing order.
vector<int> knapsack_hirschberg(vector<int> const& v, vector<int> const& w, int cap, int offset = 0) {
    if (empty(v)) {
        return {};
    }

    int mid = size(v) / 2;
    auto subSol1 = optimal_cost(vector<int>(begin(v), begin(v) + mid), vector<int>(begin(w), begin(w) + mid), cap);
    auto subSol2 = optimal_cost(vector<int>(begin(v) + mid, end(v)), vector<int>(begin(w) + mid, end(w)), cap);

    pair<int, int> best = { -1, -1 };
    for (int i = 0; i <= cap; ++i) {
        best = max(best, { subSol1[i].first + subSol2[cap - i].first, i });
    }

    vector<int> solution;
    if (subSol1[best.second].second != -1) {
        int iChosen = subSol1[best.second].second;
        solution = knapsack_hirschberg(vector<int>(begin(v), begin(v) + iChosen), vector<int>(begin(w), begin(w) + iChosen), best.second - w[iChosen], offset);
        solution.push_back(subSol1[best.second].second + offset);
    }
    if (subSol2[cap - best.second].second != -1) {
        int iChosen = mid + subSol2[cap - best.second].second;
        auto subSolution = knapsack_hirschberg(vector<int>(begin(v) + mid, begin(v) + iChosen), vector<int>(begin(w) + mid, begin(w) + iChosen), cap - best.second - w[iChosen], offset + mid);
        copy(begin(subSolution), end(subSolution), back_inserter(solution));
        solution.push_back(iChosen + offset);
    }

    return solution;
}
2
votes

Even though this is an old question I recently ran into the same problem so I figured I would write my solution here. What you need is Hirschberg's algorithm. Although this algorithm is written for reconstructing edit distances, the same principle applies here. The idea is that when searching for n items in capacity c, the knapsack state at (n/2)th item corresponding to the final maximum value is determined in the first scan. Let's call this state weight_m and value_m. This can be with keeping track of an additional 1d array of size c. So the memory is still O(c). Then the problem is divided into two parts: items 0 to n/2 with a capacity of weight_m, and items n/2 to n with a capacity of c-weight_m. The reduced problems in total is of size nc/2. Continuing this approach we can determine the knapsack state (occupied weight and current value) after each item, after which we can simply check to see which items were included. This algorithm completes in O(2nc) while using O(c) memory, so in terms of big-O nothing is changed even though the algorithm is at least twice as slow. I hope this helps to anyone who is facing a similar problem.

1
votes

To my understanding, with the proposed solution, it is effectively impossible to obtain the set of involved items for a certain objective value. The set of items can be obtained by either generating the discarded rows again or maintain a suitable auxiliary data structure. This could be done by associating each entry in A with the list of items from which it was generated. However, this would require more memory than the initially proposed solution. Approaches for backtracking for knapsack problems is also briefly discussed in this journal paper.