111
votes
s = [1,2,3,4,5,6,7,8,9]
n = 3

zip(*[iter(s)]*n) # returns [(1,2,3),(4,5,6),(7,8,9)]

How does zip(*[iter(s)]*n) work? What would it look like if it was written with more verbose code?

6
also take a look here where how it works is also explained: stackoverflow.com/questions/2202461/… - Matt Joiner
if answers here aren't enough, I blogged it here: telliott99.blogspot.com/2010/01/… - telliott99
Although very intriguing, this technique must go against the core "readability" value of Python! - Demis

6 Answers

116
votes

iter() is an iterator over a sequence. [x] * n produces a list containing n quantity of x, i.e. a list of length n, where each element is x. *arg unpacks a sequence into arguments for a function call. Therefore you're passing the same iterator 3 times to zip(), and it pulls an item from the iterator each time.

x = iter([1,2,3,4,5,6,7,8,9])
print zip(x, x, x)
49
votes

The other great answers and comments explain well the roles of argument unpacking and zip().

As Ignacio and ujukatzel say, you pass to zip() three references to the same iterator and zip() makes 3-tuples of the integers—in order—from each reference to the iterator:

1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9
^                    ^                    ^            
      ^                    ^                    ^
            ^                    ^                    ^

And since you ask for a more verbose code sample:

chunk_size = 3
L = [1,2,3,4,5,6,7,8,9]

# iterate over L in steps of 3
for start in range(0,len(L),chunk_size): # xrange() in 2.x; range() in 3.x
    end = start + chunk_size
    print L[start:end] # three-item chunks

Following the values of start and end:

[0:3) #[1,2,3]
[3:6) #[4,5,6]
[6:9) #[7,8,9]

FWIW, you can get the same result with map() with an initial argument of None:

>>> map(None,*[iter(s)]*3)
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]

For more on zip() and map(): http://muffinresearch.co.uk/archives/2007/10/16/python-transposing-lists-with-map-and-zip/

32
votes

I think one thing that's missed in all the answers (probably obvious to those familiar with iterators) but not so obvious to others is -

Since we have the same iterator, it gets consumed and the remaining elements are used by the zip. So if we simply used the list and not the iter eg.

l = range(9)
zip(*([l]*3)) # note: not an iter here, the lists are not emptied as we iterate 
# output 
[(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4), (5, 5, 5), (6, 6, 6), (7, 7, 7), (8, 8, 8)]

Using iterator, pops the values and only keeps remaining available, so for zip once 0 is consumed 1 is available and then 2 and so on. A very subtle thing, but quite clever!!!

9
votes

iter(s) returns an iterator for s.

[iter(s)]*n makes a list of n times the same iterator for s.

So, when doing zip(*[iter(s)]*n), it extracts an item from all the three iterators from the list in order. Since all the iterators are the same object, it just groups the list in chunks of n.

6
votes

One word of advice for using zip this way. It will truncate your list if it's length is not evenly divisible. To work around this you could either use itertools.izip_longest if you can accept fill values. Or you could use something like this:

def n_split(iterable, n):
    num_extra = len(iterable) % n
    zipped = zip(*[iter(iterable)] * n)
    return zipped if not num_extra else zipped + [iterable[-num_extra:], ]

Usage:

for ints in n_split(range(1,12), 3):
    print ', '.join([str(i) for i in ints])

Prints:

1, 2, 3
4, 5, 6
7, 8, 9
10, 11
1
votes

It is probably easier to see what is happening in python interpreter or ipython with n = 2:

In [35]: [iter("ABCDEFGH")]*2
Out[35]: [<iterator at 0x6be4128>, <iterator at 0x6be4128>]

So, we have a list of two iterators which are pointing to the same iterator object. Remember that iter on a object returns an iterator object and in this scenario, it is the same iterator twice due to the *2 python syntactic sugar. Iterators also run only once.

Further, zip takes any number of iterables (sequences are iterables) and creates tuple from i'th element of each of the input sequences. Since both iterators are identical in our case, zip moves the same iterator twice for each 2-element tuple of output.

In [41]: help(zip)
Help on built-in function zip in module __builtin__:

zip(...)
    zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

    Return a list of tuples, where each tuple contains the i-th element
    from each of the argument sequences.  The returned list is truncated
    in length to the length of the shortest argument sequence.

The unpacking (*) operator ensures that the iterators run to exhaustion which in this case is until there is not enough input to create a 2-element tuple.

This can be extended to any value of n and zip(*[iter(s)]*n) works as described.