56
votes

Possible Duplicate:
Flattening a shallow list in Python
Flatten (an irregular) list of lists in Python

EDIT: The question is not how to do it - this has been discussed in other questions - the question is, which is the fastest method?

I've found solutions before, but I'm wondering what the fastest solution is to flatten lists which contain other lists of arbitrary length.

For example:

[1, 2, [3, 4, [5],[]], [6]]

Would become:

[1,2,3,4,5,6]

There can be infinitely many levels. Some of the list objects can be strings, which mustn't be flattened into their sequential characters in the output list.

3
This answer from a previously asked question should do what you want. (Voting to close as duplicate). - GreenMatt
@GreenMatt, it's only a duplicate if the question is the same, not the answer. It's clear that the other question is for a single level of nesting, and simpler solutions are more appropriate. - Mark Ransom
@Mark Ransom: I thought I'd seen this before, and that was what I found when I searched. Guess I didn't read the question carefully enough. - GreenMatt
Please stop removing the system-generated "Possible duplicate" banner. The second link has the answer to your question. - vaultah
Disagree, the answer might happen to be the same, but the question needs to work for arbitrarily nested lists and so isn't a duplicate question. - Rory

3 Answers

73
votes

Here's a recursive approach that is string friendly:

nests = [1, 2, [3, 4, [5],['hi']], [6, [[[7, 'hello']]]]]

def flatten(container):
    for i in container:
        if isinstance(i, (list,tuple)):
            for j in flatten(i):
                yield j
        else:
            yield i

print list(flatten(nests))

returns:

[1, 2, 3, 4, 5, 'hi', 6, 7, 'hello']

Note, this doesn't make any guarantees for speed or overhead use, but illustrates a recursive solution that hopefully will be helpful.

22
votes

It doesn't have to be recursive. In fact, an iterative solution is often faster because of the overhead involved in function calls. Here's an iterative version I wrote a while back:

def flatten(items, seqtypes=(list, tuple)):
    for i, x in enumerate(items):
        while i < len(items) and isinstance(items[i], seqtypes):
            items[i:i+1] = items[i]
    return items

Haven't tested the performance of this specific implementation, but it is probably not so great because of all the slice assignments, which could end up moving a lot of memory around. Still, don't assume it has to be recursive, or that it's simpler to write it that way.

This implementation does have the advantage of flattening the list "in place" rather than returning a copy, as recursive solutions invariably do. This could be useful when memory is tight. If you want a flattened copy, just pass in a shallow copy of the list you want to flatten:

flatten(mylist)                # flattens existing list
newlist = flatten(mylist[:])   # makes a flattened copy

Also, this algorithm is not limited by the Python recursion limit because it's not recursive. I'm certain this will virtually never come into play, however.

9
votes

This function should be able to quickly flatten nested, iterable containers without using any recursion:

import collections

def flatten(iterable):
    iterator = iter(iterable)
    array, stack = collections.deque(), collections.deque()
    while True:
        try:
            value = next(iterator)
        except StopIteration:
            if not stack:
                return tuple(array)
            iterator = stack.pop()
        else:
            if not isinstance(value, str) \
               and isinstance(value, collections.Iterable):
                stack.append(iterator)
                iterator = iter(value)
            else:
                array.append(value)

After about five years, my opinion on the matter has changed, and this may be even better to use:

def main():
    data = [1, 2, [3, 4, [5], []], [6]]
    print(list(flatten(data)))


def flatten(iterable):
    iterator, sentinel, stack = iter(iterable), object(), []
    while True:
        value = next(iterator, sentinel)
        if value is sentinel:
            if not stack:
                break
            iterator = stack.pop()
        elif isinstance(value, str):
            yield value
        else:
            try:
                new_iterator = iter(value)
            except TypeError:
                yield value
            else:
                stack.append(iterator)
                iterator = new_iterator


if __name__ == '__main__':
    main()