1 vote

I have looked at several other questions on permutations and this variant does not seem to appear. I am looking for a simple way to generate a biased permutation. Here's what I mean.

Suppose I have two lists (though I need to solve for N lists):

l1 = [a, b, c]
l2 = [d, e, f]

Permutations of these lists would look like [(a,d), (a,e), (a,f), (b,d), (b,e), (b,f) ...]. However, in my world the elements of a permutation are scored, and the scores are summed to value the arrangement. Suppose a, d and e are worth 2 points and b, c and f are worth 1 point. The values of some sample permutations are then:

(a,d) = 4
(a,e) = 4 
(a,f) = 3
(c,f) = 2

I need to generate all permutations, but I would like to generate high-value permutations before lower-value ones.

Assuming the elements of each list are sorted in descending value order, is there a nice way to generate the permutations in value-order?

(Obviously I can generate all of the permutations and sort them but I would prefer to write a generator as the number of permutations can be large.)
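For reference, the generate-everything-and-sort baseline mentioned above might look like the sketch below. It assumes each list holds (item, value) tuples; the name `sorted_product` is made up for illustration. It materialises every combination in memory, which is exactly the cost a generator would avoid.

```python
from itertools import product

def sorted_product(scored_lists):
    """Return every combination as (items, total_value), highest total first.

    scored_lists is assumed to be a list of lists of (item, value) tuples.
    """
    scored = [
        (tuple(item for item, _ in combo), sum(value for _, value in combo))
        for combo in product(*scored_lists)
    ]
    # sort() is stable, so ties keep the order in which product() emitted them.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored

l1 = [("a", 2), ("b", 1), ("c", 1)]
l2 = [("d", 2), ("e", 2), ("f", 1)]
for items, total in sorted_product([l1, l2]):
    print(items, total)
```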

why is b before d? Also do you want the product or permutations? - Padraic Cunningham
Should (b, c) really be in the list? My understanding was that you were mixing the two. - Slater Victoroff
voting to close as it is not very clear what you expect as output - Padraic Cunningham
@PadraicCunningham I am looking for the permutations. I fixed the example, looks clear now. - Toaster
@SlaterTyranus That was a typo, should have been c,f. The lists are independent, should not mix. - Toaster

2 Answers

1 vote

This should be pretty easily approachable with a simple greedy-style algorithm. That said, I'm assuming you have access to the specific values rather than just lists sorted by value. I'm also assuming that each list comes sorted.

l1 = [(a, 2), (b, 1), (c, 1)]
l2 = [(d, 2), (e, 2), (f, 1)]

The truth is it's pretty non-trivial, but here's how you can approach the problem (code will probably come later, because as I said, it's actually non-trivial.)

Assume you have three possible actions at any given point:

  1. (The l1 entry with the highest next_values (see below), its highest remaining entry of l2)
  2. (Current entry of l1, next entry of l2) # Assumes you're iterating through l2
  3. (Next entry of l1, highest remaining entry of l2)

Then you simply keep track of the value of each of those three over time and choose the best one at each timestep. It's not entirely lazy, as you have to update these three values at each timestep, but it's pretty close.

To actually implement this, there are two mappings we have to keep around:

next_entries: {*l1: last entry of l2 explored}
next_values: {*l1: l1 entry + next l2 entry}

With those in place, the values of the three possible actions above can be computed. Again, I can produce code and probably will, but it's probably ~20 dense lines to do in a good, readable way.
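One possible reading of the bookkeeping above, as a rough sketch rather than the answerer's actual code: keep next_entries and next_values keyed by index into l1, and at each step emit the pair with the largest pending value. The function name `greedy_pairs` and the use of indices as keys are assumptions made for illustration, and the linear `max` scan is kept for simplicity.

```python
def greedy_pairs(l1, l2):
    """Yield (item1, item2, value) in descending value order.

    l1 and l2 are lists of (item, value) tuples, each sorted by value, descending.
    """
    if not l2:
        return
    # next_entries[i]: index in l2 of the next unexplored partner for l1[i]
    next_entries = {i: 0 for i in range(len(l1))}
    # next_values[i]: value of pairing l1[i] with that partner
    next_values = {i: l1[i][1] + l2[0][1] for i in range(len(l1))}
    while next_values:
        # Pick the l1 entry whose pending pair is worth the most.
        i = max(next_values, key=next_values.get)
        j = next_entries[i]
        yield l1[i][0], l2[j][0], next_values[i]
        if j + 1 < len(l2):
            # Advance this l1 entry to its next-best partner in l2.
            next_entries[i] = j + 1
            next_values[i] = l1[i][1] + l2[j + 1][1]
        else:
            # This l1 entry has been paired with everything in l2.
            del next_entries[i]
            del next_values[i]

l1 = [("a", 2), ("b", 1), ("c", 1)]
l2 = [("d", 2), ("e", 2), ("f", 1)]
for x, y, value in greedy_pairs(l1, l2):
    print(x, y, value)
```

Each step costs O(len(l1)) because of the scan; swapping the scan for a heap would bring that down to O(log(len(l1))).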

1 vote

For example, let's take l1={6,4,3,1} and l2={5,4,1}. Plot them as horizontal and vertical lines on a 2D plane.

[Figure: plot and sweep line]

Then the points of interest are all the intersections. We should report those intersections in the order in which an imaginary sweep line moving from (inf, inf) to (0, 0) touches them. Notice that a point lying on a horizontal line can't be reported earlier than another point on the same line that is further to the right. So for each horizontal line we need to check only the rightmost unreported point. From all those points, we must choose the one with the greatest sum of coordinates. This can be done with a heap data structure.

Initially, we place all points lying on the rightmost vertical line into the heap. Then we repeatedly extract the top point from the heap, yield it, and put its left neighbour into the heap (if it has one). So the heap always contains at most len(l1) elements, and every newly generated point costs us O(log(len(l1))). The solution can be improved if we choose l1 to be the shorter of the two given lists.

Here is an example solution:

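import heapq

a = [("a", 6), ("b", 4), ("c", 3), ("d", 1)]
b = [("e", 5), ("f", 5), ("g", 4), ("h", 2)]

class Pair:
    """A candidate pair: indices into a and b, plus the summed value."""
    def __init__(self, i, j, value):
        self.i = i
        self.j = j
        self.value = value
    def __lt__(self, other):
        # heapq is a min-heap, so invert the comparison to pop the largest value first.
        # (Python 3 ignores __cmp__; heapq only needs __lt__.)
        return self.value > other.value

def solution(a, b):
    heap = []
    # Seed the heap with every entry of a paired with the best entry of b.
    for i in range(len(a)):
        heapq.heappush(heap, Pair(i, 0, a[i][1] + b[0][1]))
    while heap:
        pair = heapq.heappop(heap)
        yield (a[pair.i], b[pair.j], pair.value)
        # Replace the popped pair with its neighbour: same entry of a, next entry of b.
        if pair.j + 1 < len(b):
            heapq.heappush(heap, Pair(pair.i, pair.j + 1, a[pair.i][1] + b[pair.j + 1][1]))

# Use fresh loop names so the input lists a and b are not shadowed.
for (x, y, value) in solution(a, b):
    print("%s %s -> %d" % (x, y, value))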

Things get worse when we go to higher dimensions (more than 2 lists to combine). It can be solved on top of the 2D solution with memoization: we first build the answer for l1, l2 as a lazy-list-like data structure, then apply the same algorithm again with this memoized list and l3 as arguments, and so on. One last step must be taken: we should either always use the shorter array as l1, or get rid of pushing all elements of l1 into the heap at the beginning.

The complete code example for N lists is here, as it is too long to include.
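As a much shorter illustration of the N-list idea (a sketch under assumptions, not the linked code): the version below memoizes each intermediate result in a small cache and avoids seeding the heap with all of l1 by pushing row i+1 only once the first pair of row i has been reported. The names `LazyList`, `merge_two` and `ordered_product` are made up for this example, and each input list is assumed to hold (item, value) tuples sorted by value, descending.

```python
import heapq

class LazyList:
    """Caches the items of an iterator so they can be read by index repeatedly."""
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = []

    def get(self, i):
        """Return item i, or None if the iterator runs out before reaching it."""
        while len(self._cache) <= i:
            try:
                self._cache.append(next(self._it))
            except StopIteration:
                return None
        return self._cache[i]

def merge_two(lazy_a, b):
    """Yield (items, value) for every pairing of lazy_a with b, highest value first.

    lazy_a wraps a descending stream of (items_tuple, value);
    b is a list of (item, value) sorted by value, descending.
    """
    first = lazy_a.get(0)
    if first is None or not b:
        return
    # heapq is a min-heap, so values are stored negated. Entries: (-value, i, j).
    heap = [(-(first[1] + b[0][1]), 0, 0)]
    while heap:
        neg_value, i, j = heapq.heappop(heap)
        items, a_value = lazy_a.get(i)
        yield items + (b[j][0],), -neg_value
        # Same entry of lazy_a, next entry of b.
        if j + 1 < len(b):
            heapq.heappush(heap, (-(a_value + b[j + 1][1]), i, j + 1))
        # Open the next row only when this row's first pair has been reported.
        if j == 0:
            nxt = lazy_a.get(i + 1)
            if nxt is not None:
                heapq.heappush(heap, (-(nxt[1] + b[0][1]), i + 1, 0))

def ordered_product(lists):
    """Fold merge_two over all lists, starting from one empty combination."""
    current = iter([((), 0)])
    for lst in lists:
        current = merge_two(LazyList(current), lst)
    return current

lists = [
    [("a", 2), ("b", 1), ("c", 1)],
    [("d", 2), ("e", 2), ("f", 1)],
    [("g", 3), ("h", 1)],
]
for items, value in ordered_product(lists):
    print(items, value)
```

Because every stage is a generator, taking only the first few results forces only as much of the earlier stages as is needed to produce them.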