2
votes

Say I have two sets of data, each entry consisting of a weight. Each set is ordered by weight ascending. I'm going to list a couple example sets:

Set 0: {4, 8, 19, 25, 25, 26}    
Set 1: {1, 2, 3, 8, 20, 27}

I want to find every possible pair between these two sets, but I want to find those pairs in order of their summed weight from smallest to largest. So 4+1, 4+2, 4+3, 8+1, etc. You can assume the sets are c++ std::multiset, but beyond that I think it'd be the same in any language.

Now, I thought about having 2 iterators and iterating through starting with the first paired with each second in order, etc but that won't compute each pair in order from smallest summed weight up, since the Set 0: 4 + Set 1: 8 > Set 0: 8 + Set 1: 1, in this example. I could always dump the result into a container that does the sorting for me, but that seems inefficient. I have other optimizations that hinge on being able to do this, as well. Is there a clever way to do this without an extra sort at the end?

edit: I need to get every possible pairing, so it's not as simple as incrementing one iterator or the other to get the smallest sum. That will miss most (half?) of the pairs. Although maybe with an iterator stack of some kind...

2
One solution uses heap: push all sum of first element in one set + all elements in the other set in, then when you pop, you will push a new element in that is sum of the next element from one set and the same element from the other set. Just curious, but are you solving the sum of 4 values whose sum = 0 problem spoj.pl/problems/SUMFOUR?nhahtdh
Nope, I just want to iterate through every pair of weights in order. The weights are actually map keys to objects that I want to perform a function on, in this specific order.user173342
It's the same algorithm describe here, just a bit clearer than what I wrote above: online-judge.uva.es/board/…nhahtdh

2 Answers

2
votes

Let us denote the 2 sorted lists A (size m) and B (size n).

The algorithm:

  1. Calculate the sum of A[0] and B[i] for all i = 0 to n - 1. You need to identify which element from list A the sum includes - one way is to attach the index (which is 0 for all sums here). Push all the pairs into a min-heap which is ordered by the sum.
  2. Pop the heap, and take out the sum. Use the index attached to calculate the sum with the next element in list A. If the index is m - 1 already, no need to do anything; otherwise increment the index and push it back into the heap.
  3. Repeat step 2 until the heap is empty (for all m * n values).

Step 1 will be O(n log n).

Step 2 is at most O(log n) since the size of the heap may decrease and never increase. Repeating step 2 by m * n times result in time complexity of O(mn log n).

Overall complexity is O(mn log n).

By using the smaller list as list B in the algorithm above, we can achieve a slightly better time complexity (we only need to manage a small heap instead of a big heap).

An implementation using std::priority_queue (by Stacker):

#include <iostream>
#include <set>
#include <queue>
#include <limits>

std::multiset<int> set1 {4, 8, 19, 25, 25, 26};
std::multiset<int> set2 {1, 2, 3, 8, 20, 27};

struct Node
{
  Node(std::multiset<int>::const_iterator set1Iterator, int set2Value) :
    set1Iterator(set1Iterator),
    set2Value(set2Value),
    sum(*set1Iterator + set2Value)
  {
  }

  bool operator < (const Node& other) const
  {
    return sum > other.sum; // Reverse order as std::priority_queue sorts for the greatest value
  }

  std::multiset<int>::const_iterator set1Iterator;
  int set2Value;
  int sum;
};

int main()
{
  std::priority_queue<Node> heap;
  for (auto i = set2.begin(); i != set2.end(); ++i)
    heap.push(Node(set1.begin(), *i));

  while (!heap.empty())
  {
    Node min(heap.top());
    heap.pop();

    std::cout << *min.set1Iterator << " + " << min.set2Value << " = " <<
      min.sum << std::endl;
    if (++min.set1Iterator != set1.end())
    {
      min.sum = *min.set1Iterator + min.set2Value;
      heap.push(min);
    }
  }
  return 0;
}
2
votes
#include <iostream>
#include <set>
#include <vector>
#include <limits>

std::multiset<int> set1 {4, 8, 19, 25, 25, 26};
std::multiset<int> set2 {1, 2, 3, 8, 20, 27};

int main()
{
  std::vector<std::pair<std::multiset<int>::const_iterator,
    std::multiset<int>::const_iterator>> setIterators;
  for (auto i = set1.begin(); i != set1.end(); ++i)
    setIterators.push_back(std::make_pair(i, set2.begin()));

  for (std::size_t count = set1.size() * set2.size(); count != 0; --count)
  {
    int minValue = std::numeric_limits<int>::max();
    auto minElement = setIterators.begin();
    for (auto i = setIterators.begin(); i != setIterators.end(); ++i)
    {
      if (i->second == set2.end())
        continue;
      int sum = *i->first + *i->second;
      if (sum < minValue)
      {
        minValue = sum;
        minElement = i;
      }
    }
    std::cout << *minElement->first << " + " << *minElement->second << " = " <<
      minValue << std::endl;
    ++minElement->second;
  }
  return 0;
}

Output:

$ g++ -std=c++11 main.cpp -o main && ./main
4 + 1 = 5
4 + 2 = 6
4 + 3 = 7
8 + 1 = 9
8 + 2 = 10
8 + 3 = 11
4 + 8 = 12
...
26 + 27 = 53