This is along the lines of Thijser's currently incomplete pseudocode. The idea is to take the most frequent of the remaining item types unless it was just taken. (See also Coady's implementation of this algorithm.)
import collections
import heapq

class Sentinel:
    pass

def david_eisenstat(lst):
    counts = collections.Counter(lst)
    # Build a max-heap by count (heapq is a min-heap, so counts are negated).
    heap = [(-count, key) for key, count in counts.items()]
    heapq.heapify(heap)
    output = []
    last = Sentinel()
    while heap:
        minuscount1, key1 = heapq.heappop(heap)
        if key1 != last or not heap:
            # Take the most frequent remaining type (or the only one left).
            last = key1
            minuscount1 += 1
        else:
            # The most frequent type was just taken; take the runner-up instead.
            minuscount2, key2 = heapq.heappop(heap)
            last = key2
            minuscount2 += 1
            if minuscount2 != 0:
                heapq.heappush(heap, (minuscount2, key2))
        output.append(last)
        if minuscount1 != 0:
            heapq.heappush(heap, (minuscount1, key1))
    return output
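A quick usage check (added here as an illustration; not part of the original answer). The exact arrangement depends on how the heap breaks ties, but no two equal items should end up adjacent in this example, since no item accounts for more than half of the list:

example = [1, 1, 1, 2, 2, 3]
result = david_eisenstat(example)
print(result)  # e.g. [1, 2, 1, 2, 1, 3]
assert sorted(result) == sorted(example)
assert sum(result[i - 1] == result[i] for i in range(1, len(result))) == 0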
Proof of correctness
For two item types, with counts k1 and k2, the optimal solution has k2 - k1 - 1 defects if k1 < k2, 0 defects if k1 = k2, and k1 - k2 - 1 defects if k1 > k2. The = case is obvious. The others are symmetric; each instance of the minority element prevents at most two defects out of a total of k1 + k2 - 1 possible.
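As a quick sanity check of this closed form (not part of the original argument; the helpers below are throwaway, defined inline), it can be compared against brute force over all arrangements of two item types with small counts:

from itertools import permutations

def two_type_optimum(k1, k2):
    # Closed form above: max(|k1 - k2| - 1, 0) unavoidable defects.
    return max(abs(k1 - k2) - 1, 0)

def brute_force_optimum(k1, k2):
    lst = [1] * k1 + [2] * k2
    return min(sum(p[i - 1] == p[i] for i in range(1, len(p)))
               for p in permutations(lst))

for k1 in range(1, 5):
    for k2 in range(1, 5):
        assert two_type_optimum(k1, k2) == brute_force_optimum(k1, k2)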
This greedy algorithm returns optimal solutions, by the following logic. We call a prefix (partial solution) safe if it extends to an optimal solution. Clearly the empty prefix is safe, and if a safe prefix is a whole solution then that solution is optimal. It suffices to show inductively that each greedy step maintains safety.
The only way that a greedy step introduces a defect is if only one item type remains, in which case there is only one way to continue, and that way is safe. Otherwise, let P be the (safe) prefix just before the step under consideration, let P' be the prefix just after, and let S be an optimal solution extending P. If S extends P' also, then we're done. Otherwise, let P' = Px and S = PQ and Q = yQ', where x and y are items and Q and Q' are sequences.
Suppose first that P does not end with y. By the algorithm's choice, x is at least as frequent in Q as y. Consider the maximal substrings of Q containing only x and y. If the first substring has at least as many x's as y's, then it can be rewritten without introducing additional defects to begin with x. If the first substring has more y's than x's, then some other substring has more x's than y's, and we can rewrite these substrings without additional defects so that x goes first. In both cases, we find an optimal solution T that extends P', as needed.
Suppose now that P does end with y. Modify Q by moving the first occurrence of x to the front. In doing so, we introduce at most one defect (where x used to be) and eliminate one defect (the yy between P and Q). The result is therefore an optimal solution that extends P', as needed.
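The optimality claim can also be spot-checked empirically (a quick check, not part of the proof; defect_count is a throwaway helper) by comparing the greedy output against brute force on small random lists:

from itertools import permutations
from random import randrange

def defect_count(lst):
    # Number of adjacent equal pairs.
    return sum(lst[i - 1] == lst[i] for i in range(1, len(lst)))

for _ in range(1000):
    lst = [randrange(3) for _ in range(randrange(1, 7))]
    best = min(defect_count(p) for p in permutations(lst))
    assert defect_count(david_eisenstat(lst)) == best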
Generating all solutions
This is tobias_k's answer plus efficient tests to detect when the choice currently under consideration is globally constrained in some way. The asymptotic running time is optimal, since the generation overhead is on the order of the total length of the output. Unfortunately, the worst-case delay between consecutive solutions is quadratic; it could be reduced to linear (optimal) with better data structures.
from collections import Counter
from itertools import permutations
from operator import itemgetter
from random import randrange

def get_mode(count):
    # Most frequent remaining item type.
    return max(count.items(), key=itemgetter(1))[0]

def enum2(prefix, x, count, total, mode):
    # Place x, recurse, then undo the changes (backtracking).
    prefix.append(x)
    count_x = count[x]
    if count_x == 1:
        del count[x]
    else:
        count[x] = count_x - 1
    yield from enum1(prefix, count, total - 1, mode)
    count[x] = count_x
    del prefix[-1]

def enum1(prefix, count, total, mode):
    if total == 0:
        yield tuple(prefix)
        return
    if count[mode] * 2 - 1 >= total and [mode] != prefix[-1:]:
        # The most frequent item is forced: every optimal completion places it next.
        yield from enum2(prefix, mode, count, total, mode)
    else:
        # Repeating the previous item is allowed only when a defect involving it
        # is unavoidable anyway.
        defect_okay = not prefix or count[prefix[-1]] * 2 > total
        mode = get_mode(count)
        for x in list(count.keys()):
            if defect_okay or [x] != prefix[-1:]:
                yield from enum2(prefix, x, count, total, mode)

def enum(seq):
    count = Counter(seq)
    if count:
        yield from enum1([], count, sum(count.values()), get_mode(count))
    else:
        yield ()

def defects(lst):
    # Number of adjacent equal pairs.
    return sum(lst[i - 1] == lst[i] for i in range(1, len(lst)))

def test(lst):
    # Brute-force the set of optimal arrangements and compare with enum.
    perms = set(permutations(lst))
    opt = min(map(defects, perms))
    slow = {perm for perm in perms if defects(perm) == opt}
    fast = set(enum(lst))
    print(lst, fast, slow)
    assert slow == fast

for r in range(10000):
    test([randrange(3) for i in range(randrange(6))])
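A small usage example (added here as an illustration; not part of the original answer), showing all optimal arrangements of two short lists:

print(list(enum([1, 1, 2])))        # [(1, 2, 1)]
print(sorted(enum([1, 1, 2, 2])))   # [(1, 2, 1, 2), (2, 1, 2, 1)]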
Comments

[1, 2, 1, 3, 1, 4, 1, 5] is exactly the same as [1, 3, 1, 2, 1, 4, 1, 5] by your criterion? – Bakuriu

[1, 1, 1, ..., 2, 3, 4, ..., N] with 2N elements. You can put a number n > 1 between each pair of consecutive 1s to obtain a good permutation. Then you permute the N/2 elements and obtain all valid permutations (meaning none is a bad one, but there may be more). The number of such permutations is O(N^2), so you cannot do better than O(N^2). Still better than O(N^3) of the naive approach though. – Bakuriu