95
votes

It is a common mistake in Python to set a mutable object as the default value of an argument in a function. Here's an example taken from this excellent write-up by David Goodger:

>>> def bad_append(new_item, a_list=[]):
        a_list.append(new_item)
        return a_list
>>> print bad_append('one')
['one']
>>> print bad_append('two')
['one', 'two']

The explanation why this happens is here.

And now for my question: Is there a good use-case for this syntax?

I mean, if everybody who encounters it makes the same mistake, debugs it, understands the issue and from thereon tries to avoid it, what use is there for such syntax?

7
The best explanation I know for this is in the linked question: functions are first-class objects, just like classes. Classes have mutable attribute data; functions have mutable default values.Katriel
This behavior it is not a "design choice" - it is a result from the way the language works - starting from simple working principles, with as few exceptions as possible. At some point for me, as I started to "think in Python" this behavior just became natural - and I'd be surprised if it did not happenjsbueno
I've wondered this too. This example is all over the web, but it just doesn't make sense - either you want to mutate the passed list and having a default doesn't make sense, or you want to return a new list and you should make a copy immediately upon entering the function. I can't imagine the case where it's useful to do both.Mark Ransom
I just came across a more realistic example that doesn't have the problem I complain about above. The default is an argument to the __init__ function for a class, which gets set into an instance variable; this is a perfectly valid thing to want to do, and it all goes horribly wrong with a mutable default. stackoverflow.com/questions/43768055/…Mark Ransom
@MarkRansom: With your definition, there wouldn't be any bug ever on a (deterministic) computer. Every bug makes sense when you spend enough time grokking the internals. Let's be honest and call this behaviour one of the very few design flaws in Python.Eric Duminil

7 Answers

65
votes

You can use it to cache values between function calls:

def get_from_cache(name, cache={}):
    if name in cache: return cache[name]
    cache[name] = result = expensive_calculation()
    return result

but usually that sort of thing is done better with a class as you can then have additional attributes to clear the cache etc.

14
votes

Canonical answer is this page: http://effbot.org/zone/default-values.htm

It also mentions 3 "good" use cases for mutable default argument:

  • binding local variable to current value of outer variable in a callback
  • cache/memoization
  • local rebinding of global names (for highly optimized code)
13
votes

Maybe you do not mutate the mutable argument, but do expect a mutable argument:

def foo(x, y, config={}):
    my_config = {'debug': True, 'verbose': False}
    my_config.update(config)
    return bar(x, my_config) + baz(y, my_config)

(Yes, I know you can use config=() in this particular case, but I find that less clear and less general.)

12
votes
import random

def ten_random_numbers(rng=random):
    return [rng.random() for i in xrange(10)]

Uses the random module, effectively a mutable singleton, as its default random number generator.

4
votes

EDIT (clarification): The mutable default argument issue is a symptom of a deeper design choice, namely, that default argument values are stored as attributes on the function object. You might ask why this choice was made; as always, such questions are difficult to answer properly. But it certainly has good uses:

Optimising for performance:

def foo(sin=math.sin): ...

Grabbing object values in a closure instead of the variable.

callbacks = []
for i in range(10):
    def callback(i=i): ...
    callbacks.append(callback)
2
votes

I know this is an old one, but just for the heck of it I'd like to add a use case to this thread. I regularly write custom functions and layers for TensorFlow/Keras, upload my scripts to a server, train the models (with custom objects) there, and then save the models and download them. In order to load those models I then need to provide a dictionary containing all of those custom objects.

What you can do in situations like mine is add some code to the module containing those custom objects:

custom_objects = {}

def custom_object(obj, storage=custom_objects):
    storage[obj.__name__] = obj
    return obj

Then, I can just decorate any class/funcion that needs to be in the dictionary

@custom_object
def some_function(x):
    return 3*x*x + 2*x - 2

Moreover, say I want to store my custom loss funcions in a different dictionary than my custom Keras layers. Using functools.partial gives me easy access to a new decorator

import functools
import tf

custom_losses = {}
custom_loss = functools.partial(custom_object, storage=custom_losses)

@custom_loss
def my_loss(y, y_pred):
    return tf.reduce_mean(tf.square(y - y_pred))
-1
votes

In answer to the question of good uses for mutable default argument values, I offer the following example:

A mutable default can be useful for programing easy to use, importable commands of your own creation. The mutable default method amount to having private, static variables in a function that you can initialization on the first call (very much like a class) but without having to resort to globals, without having to use a wrapper, and without having to instantize a class object that was imported. It is in its own way elegant, as I hope you will agree.

Consider these two examples:

def dittle(cache = []):

    from time import sleep # Not needed except as an example.

    # dittle's internal cache list has this format: cache[string, counter]
    # Any argument passed to dittle() that violates this format is invalid.
    # (The string is pure storage, but the counter is used by dittle.)

     # -- Error Trap --
    if type(cache) != list or cache !=[] and (len(cache) == 2 and type(cache[1]) != int):
        print(" User called dittle("+repr(cache)+").\n >> Warning: dittle() takes no arguments, so this call is ignored.\n")
        return

    # -- Initialize Function. (Executes on first call only.) --
    if not cache:
        print("\n cache =",cache)
        print(" Initializing private mutable static cache. Runs only on First Call!")
        cache.append("Hello World!")
        cache.append(0)
        print(" cache =",cache,end="\n\n")
    # -- Normal Operation --
    cache[1]+=1 # Static cycle count.
    outstr = " dittle() called "+str(cache[1])+" times."
    if cache[1] == 1:outstr=outstr.replace("s.",".")
    print(outstr)
    print(" Internal cache held string = '"+cache[0]+"'")
    print()
    if cache[1] == 3:
        print(" Let's rest for a moment.")
        sleep(2.0) # Since we imported it, we might as well use it.
        print(" Wheew! Ready to continue.\n")
        sleep(1.0)
    elif cache[1] == 4:
        cache[0] = "It's Good to be Alive!" # Let's change the private message.

# =================== MAIN ======================        
if __name__ == "__main__":

    for cnt in range(2):dittle() # Calls can be loop-driven, but they need not be.

    print(" Attempting to pass an list to dittle()")
    dittle([" BAD","Data"])
    
    print(" Attempting to pass a non-list to dittle()")
    dittle("hi")
    
    print(" Calling dittle() normally..")
    dittle()
    
    print(" Attempting to set the private mutable value from the outside.")
    # Even an insider's attempt to feed a valid format will be accepted
    # for the one call only, and is then is discarded when it goes out
    # of scope. It fails to interrupt normal operation.
    dittle([" I am a Grieffer!\n (Notice this change will not stick!)",-7]) 
    
    print(" Calling dittle() normally once again.")
    dittle()
    dittle()

If you run this code, you will see that the dittle() function internalizes on the the very first call but not on additional calls, it uses a private static cache (the mutable default) for internal static storage between calls, rejects attempts to hijack the static storage, is resilient to malicious input, and can act based on dynamic conditions (here on the number of times the function has been called.)

The key to using mutable defaults not to do anything what will reassign the variable in memory, but to always change the variable in place.

To really see the potential power and usefulness of this technique, save this first program to your current directory under the name "DITTLE.py", then run the next program. It imports and uses our new dittle() command without requiring any steps to remember or programing hoops to jump through.

Here is our second example. Compile and run this as a new program.

from DITTLE import dittle

print("\n We have emulated a new python command with 'dittle()'.\n")
# Nothing to declare, nothing to instantize, nothing to remember.

dittle()
dittle()
dittle()
dittle()
dittle()

Now isn't that as slick and clean as can be? These mutable defaults can really come in handy.

========================

After reflecting on my answer for a while, I'm not sure that I made the difference between using the mutable default method and the regular way of accomplishing the same thing clear.

The regular way is to use an importable function that wraps a Class object (and uses a global). So for comparison, here a Class-based method that attempts to do the same things as the mutable default method.

from time import sleep

class dittle_class():

    def __init__(self):
        
        self.b = 0
        self.a = " Hello World!"
        
        print("\n Initializing Class Object. Executes on First Call only.")
        print(" self.a = '"+str(self.a),"', self.b =",self.b,end="\n\n")
    
    def report(self):
        self.b  = self.b + 1
        
        if self.b == 1:
            print(" Dittle() called",self.b,"time.")
        else:
            print(" Dittle() called",self.b,"times.")
        
        if self.b == 5:
            self.a = " It's Great to be alive!"
        
        print(" Internal String =",self.a,end="\n\n")
            
        if self.b ==3:
            print(" Let's rest for a moment.")
            sleep(2.0) # Since we imported it, we might as well use it.
            print(" Wheew! Ready to continue.\n")
            sleep(1.0)

cl= dittle_class()

def dittle():
    global cl
    
    if type(cl.a) != str and type(cl.b) != int:
        print(" Class exists but does not have valid format.")
        
    cl.report()

# =================== MAIN ====================== 
if __name__ == "__main__":
    print(" We have emulated a python command with our own 'dittle()' command.\n")
    for cnt in range(2):dittle() # Call can be loop-driver, but they need not be.
    
    print(" Attempting to pass arguments to dittle()")
    try: # The user must catch the fatal error. The mutable default user did not. 
        dittle(["BAD","Data"])
    except:
        print(" This caused a fatal error that can't be caught in the function.\n")
    
    print(" Calling dittle() normally..")
    dittle()
    
    print(" Attempting to set the Class variable from the outside.")
    cl.a = " I'm a griefer. My damage sticks."
    cl.b = -7
    
    dittle()
    dittle()

Save this Class-based program in your current directory as DITTLE.py then run the following code (which is the same as earlier.)

from DITTLE import dittle
# Nothing to declare, nothing to instantize, nothing to remember.

dittle()
dittle()
dittle()
dittle()
dittle()

By comparing the two methods, the advantages of using a mutable default in a function should be clearer. The mutable default method needs no globals, it's internal variables can't be set directly. And while the mutable method accepted a knowledgeable passed argument for a single cycle then shrugged it off, the Class method was permanently altered because its internal variable are directly exposed to the outside. As for which method is easier to program? I think that depends on your comfort level with the methods and the complexity of your goals.