115
votes

Consider the following expressions. Note that some expressions are repeated to present the "context".

(this is a long list)

a, b = 1, 2                          # simple sequence assignment
a, b = ['green', 'blue']             # list asqignment
a, b = 'XY'                          # string assignment
a, b = range(1,5,2)                  # any iterable will do


                                     # nested sequence assignment

(a,b), c = "XY", "Z"                 # a = 'X', b = 'Y', c = 'Z' 

(a,b), c = "XYZ"                     # ERROR -- too many values to unpack
(a,b), c = "XY"                      # ERROR -- need more than 1 value to unpack

(a,b), c, = [1,2],'this'             # a = '1', b = '2', c = 'this'
(a,b), (c,) = [1,2],'this'           # ERROR -- too many values to unpack


                                     # extended sequence unpacking

a, *b = 1,2,3,4,5                    # a = 1, b = [2,3,4,5]
*a, b = 1,2,3,4,5                    # a = [1,2,3,4], b = 5
a, *b, c = 1,2,3,4,5                 # a = 1, b = [2,3,4], c = 5

a, *b = 'X'                          # a = 'X', b = []
*a, b = 'X'                          # a = [], b = 'X'
a, *b, c = "XY"                      # a = 'X', b = [], c = 'Y'
a, *b, c = "X...Y"                   # a = 'X', b = ['.','.','.'], c = 'Y'

a, b, *c = 1,2,3                     # a = 1, b = 2, c = [3]
a, b, c, *d = 1,2,3                  # a = 1, b = 2, c = 3, d = []

a, *b, c, *d = 1,2,3,4,5             # ERROR -- two starred expressions in assignment

(a,b), c = [1,2],'this'              # a = '1', b = '2', c = 'this'
(a,b), *c = [1,2],'this'             # a = '1', b = '2', c = ['this']

(a,b), c, *d = [1,2],'this'          # a = '1', b = '2', c = 'this', d = []
(a,b), *c, d = [1,2],'this'          # a = '1', b = '2', c = [], d = 'this'

(a,b), (c, *d) = [1,2],'this'        # a = '1', b = '2', c = 't', d = ['h', 'i', 's']

*a = 1                               # ERROR -- target must be in a list or tuple
*a = (1,2)                           # ERROR -- target must be in a list or tuple
*a, = (1,2)                          # a = [1,2]
*a, = 1                              # ERROR -- 'int' object is not iterable
*a, = [1]                            # a = [1]
*a = [1]                             # ERROR -- target must be in a list or tuple
*a, = (1,)                           # a = [1]
*a, = (1)                            # ERROR -- 'int' object is not iterable

*a, b = [1]                          # a = [], b = 1
*a, b = (1,)                         # a = [], b = 1

(a,b),c = 1,2,3                      # ERROR -- too many values to unpack
(a,b), *c = 1,2,3                    # ERROR - 'int' object is not iterable
(a,b), *c = 'XY', 2, 3               # a = 'X', b = 'Y', c = [2,3]


                                     # extended sequence unpacking -- NESTED

(a,b),c = 1,2,3                      # ERROR -- too many values to unpack
*(a,b), c = 1,2,3                    # a = 1, b = 2, c = 3

*(a,b) = 1,2                         # ERROR -- target must be in a list or tuple
*(a,b), = 1,2                        # a = 1, b = 2

*(a,b) = 'XY'                        # ERROR -- target must be in a list or tuple
*(a,b), = 'XY'                       # a = 'X', b = 'Y'

*(a, b) = 'this'                     # ERROR -- target must be in a list or tuple
*(a, b), = 'this'                    # ERROR -- too many values to unpack
*(a, *b), = 'this'                   # a = 't', b = ['h', 'i', 's']

*(a, *b), c = 'this'                 # a = 't', b = ['h', 'i'], c = 's'

*(a,*b), = 1,2,3,3,4,5,6,7           # a = 1, b = [2, 3, 3, 4, 5, 6, 7]

*(a,*b), *c = 1,2,3,3,4,5,6,7        # ERROR -- two starred expressions in assignment
*(a,*b), (*c,) = 1,2,3,3,4,5,6,7     # ERROR -- 'int' object is not iterable
*(a,*b), c = 1,2,3,3,4,5,6,7         # a = 1, b = [2, 3, 3, 4, 5, 6], c = 7
*(a,*b), (*c,) = 1,2,3,4,5,'XY'      # a = 1, b = [2, 3, 4, 5], c = ['X', 'Y']

*(a,*b), c, d = 1,2,3,3,4,5,6,7      # a = 1, b = [2, 3, 3, 4, 5], c = 6, d = 7
*(a,*b), (c, d) = 1,2,3,3,4,5,6,7    # ERROR -- 'int' object is not iterable
*(a,*b), (*c, d) = 1,2,3,3,4,5,6,7   # ERROR -- 'int' object is not iterable
*(a,*b), *(c, d) = 1,2,3,3,4,5,6,7   # ERROR -- two starred expressions in assignment


*(a,b), c = 'XY', 3                  # ERROR -- need more than 1 value to unpack
*(*a,b), c = 'XY', 3                 # a = [], b = 'XY', c = 3
(a,b), c = 'XY', 3                   # a = 'X', b = 'Y', c = 3

*(a,b), c = 'XY', 3, 4               # a = 'XY', b = 3, c = 4
*(*a,b), c = 'XY', 3, 4              # a = ['XY'], b = 3, c = 4
(a,b), c = 'XY', 3, 4                # ERROR -- too many values to unpack

How to correctly deduce the result of such expressions by hand?

4
Honestly, most of these are far more complicated than what you see in code everyday. Learn the basics of unpacking lists/tuples and you'll be fine.Rafe Kettler
Note that these are recursive. So, if you undersand the first few, you can handle everything. Try replacing, e.g., *( *a, b) with *x, figure out what x unpacks and then plug (*a, b) back in for x, etc.Peteris
@greengit I consider myself to have an advanced knowledge of Python and I just know the general rules :) You don't have to know every corner case, you just sometimes need to fire up an interpreter and test something.Rafe Kettler
Wow great list. I really didn't know about the a, *b = 1, 2, 3 kind of unpacking. But this is Py3k right ?Niklas R

4 Answers

121
votes

My apologies for the length of this post, but I decided to opt for completeness.

Once you know a few basic rules, it's not hard to generalize them. I'll do my best to explain with a few examples. Since you're talking about evaluating these "by hand," I'll suggest some simple substitution rules. Basically, you might find it easier to understand an expression if all the iterables are formatted in the same way.

For the purposes of unpacking only, the following substitutions are valid on the right side of the = (i.e. for rvalues):

'XY' -> ('X', 'Y')
['X', 'Y'] -> ('X', 'Y')

If you find that a value doesn't get unpacked, then you'll undo the substitution. (See below for further explanation.)

Also, when you see "naked" commas, pretend there's a top-level tuple. Do this on both the left and the right side (i.e. for lvalues and rvalues):

'X', 'Y' -> ('X', 'Y')
a, b -> (a, b)

With those simple rules in mind, here are some examples:

(a,b), c = "XY", "Z"                 # a = 'X', b = 'Y', c = 'Z'

Applying the above rules, we convert "XY" to ('X', 'Y'), and cover the naked commas in parens:

((a, b), c) = (('X', 'Y'), 'Z')

The visual correspondence here makes it fairly obvious how the assignment works.

Here's an erroneous example:

(a,b), c = "XYZ"

Following the above substitution rules, we get the below:

((a, b), c) = ('X', 'Y', 'Z')

This is clearly erroneous; the nested structures don't match up. Now let's see how it works for a slightly more complex example:

(a,b), c, = [1,2],'this'             # a = '1', b = '2', c = 'this'

Applying the above rules, we get

((a, b), c) = ((1, 2), ('t', 'h', 'i', 's'))

But now it's clear from the structure that 'this' won't be unpacked, but assigned directly to c. So we undo the substitution.

((a, b), c) = ((1, 2), 'this')

Now let's see what happens when we wrap c in a tuple:

(a,b), (c,) = [1,2],'this'           # ERROR -- too many values to unpack

Becomes

((a, b), (c,)) = ((1, 2), ('t', 'h', 'i', 's'))

Again, the error is obvious. c is no longer a naked variable, but a variable inside a sequence, and so the corresponding sequence on the right is unpacked into (c,). But the sequences have a different length, so there's an error.

Now for extended unpacking using the * operator. This is a bit more complex, but it's still fairly straightforward. A variable preceded by * becomes a list, which contains any items from the corresponding sequence that aren't assigned to variable names. Starting with a fairly simple example:

a, *b, c = "X...Y"                   # a = 'X', b = ['.','.','.'], c = 'Y'

This becomes

(a, *b, c) = ('X', '.', '.', '.', 'Y')

The simplest way to analyze this is to work from the ends. 'X' is assigned to a and 'Y' is assigned to c. The remaining values in the sequence are put in a list and assigned to b.

Lvalues like (*a, b) and (a, *b) are just special cases of the above. You can't have two * operators inside one lvalue sequence because it would be ambiguous. Where would the values go in something like this (a, *b, *c, d) -- in b or c? I'll consider the nested case in a moment.

*a = 1                               # ERROR -- target must be in a list or tuple

Here the error is fairly self-explanatory. The target (*a) must be in a tuple.

*a, = (1,2)                          # a = [1,2]

This works because there's a naked comma. Applying the rules...

(*a,) = (1, 2)

Since there are no variables other than *a, *a slurps up all the values in the rvalue sequence. What if you replace the (1, 2) with a single value?

*a, = 1                              # ERROR -- 'int' object is not iterable

becomes

(*a,) = 1

Again, the error here is self-explanatory. You can't unpack something that isn't a sequence, and *a needs something to unpack. So we put it in a sequence

*a, = [1]                            # a = [1]

Which is eqivalent to

(*a,) = (1,)

Finally, this is a common point of confusion: (1) is the same as 1 -- you need a comma to distinguish a tuple from an arithmetic statement.

*a, = (1)                            # ERROR -- 'int' object is not 

Now for nesting. Actually this example wasn't in your "NESTED" section; perhaps you didn't realize it was nested?

(a,b), *c = 'XY', 2, 3               # a = 'X', b = 'Y', c = [2,3]

Becomes

((a, b), *c) = (('X', 'Y'), 2, 3)

The first value in the top-level tuple gets assigned, and the remaining values in the top-level tuple (2 and 3) are assigned to c -- just as we should expect.

(a,b),c = 1,2,3                      # ERROR -- too many values to unpack
*(a,b), c = 1,2,3                    # a = 1, b = 2, c = 3

I've already explained above why the first line throws an error. The second line is silly but here's why it works:

(*(a, b), c) = (1, 2, 3)

As previously explained, we work from the ends. 3 is assigned to c, and then the remaining values are assigned to the variable with the * preceding it, in this case, (a, b). So that's equivalent to (a, b) = (1, 2), which happens to work because there are the right number of elements. I can't think of any reason this would ever appear in working code. Similarly,

*(a, *b), c = 'this'                 # a = 't', b = ['h', 'i'], c = 's'

becomes

(*(a, *b), c) = ('t', 'h', 'i', 's')

Working from the ends, 's' is assigned to c, and ('t', 'h', 'i') is assigned to (a, *b). Working again from the ends, 't' is assigned to a, and ('h', 'i') is assigned to b as a list. This is another silly example that should never appear in working code.

7
votes

I find the Python 2 tuple unpacking pretty straightforward. Each name on the left corresponds with either an entire sequence or a single item in a sequence on the right. If names correspond to single items of any sequence, then there must be enough names to cover all of the items.

Extended unpacking, however, can certainly be confusing, because it is so powerful. The reality is you should never be doing the last 10 or more valid examples you gave -- if the data is that structured, it should be in a dict or a class instance, not unstructured forms like lists.

Clearly, the new syntax can be abused. The answer to your question is that you shouldn't have to read expressions like that -- they're bad practice and I doubt they'll be used.

Just because you can write arbitrarily complex expressions doesn't mean you should. You could write code like map(map, iterable_of_transformations, map(map, iterable_of_transformations, iterable_of_iterables_of_iterables)) but you don't.

3
votes

I you think your code may be misleading use other form to express it.

It's like using extra brackets in expressions to avoid questions about operators precedence. I'ts always a good investment to make your code readable.

I prefer to use unpacking only for simple tasks like swap.

0
votes

The original idea of starred expression at the lhs is to improve the readability of iterable unpacking as below :

first_param, rest_param, third_param = param[0], param[1:-1], param[-1]

This statment is equvailent to

first_param, *rest_param, third_param = param

In the statement above, starred expression is used to 'catch' all elements that are not assigned to 'mandatory targets' (first_param & third_param in this example)

The use of starred expression at the lhs has the following rules:

  1. at most one starred expression at the lhs, or the unpacking will not be unique
*a,b,*c = range(5) # wrong
*a,b,c = range(5) # right
a,*b,c = range(5) # right
  1. In order to collect 'rest' elements, starred expression must be used with mandatory targets. Trailing comma is used to indicate mandatory targets do not exist
*a = range(5) # wrong
*a, = range(5) # right

I belive if you master these two rules, you can deduce any result of starred expression at the lhs.