Why is []
faster than list()
?
The biggest reason is that Python treats list()
just like a user-defined function, which means you can intercept it by aliasing something else to list
and do something different (like use your own subclassed list or perhaps a deque).
It immediately creates a new instance of a builtin list with []
.
My explanation seeks to give you the intuition for this.
Explanation
[]
is commonly known as literal syntax.
In the grammar, this is referred to as a "list display". From the docs:
A list display is a possibly empty series of expressions enclosed in
square brackets:
list_display ::= "[" [starred_list | comprehension] "]"
A list display yields a new list object, the contents being specified
by either a list of expressions or a comprehension. When a
comma-separated list of expressions is supplied, its elements are
evaluated from left to right and placed into the list object in that
order. When a comprehension is supplied, the list is constructed from
the elements resulting from the comprehension.
In short, this means that a builtin object of type list
is created.
There is no circumventing this - which means Python can do it as quickly as it may.
On the other hand, list()
can be intercepted from creating a builtin list
using the builtin list constructor.
For example, say we want our lists to be created noisily:
class List(list):
def __init__(self, iterable=None):
if iterable is None:
super().__init__()
else:
super().__init__(iterable)
print('List initialized.')
We could then intercept the name list
on the module level global scope, and then when we create a list
, we actually create our subtyped list:
>>> list = List
>>> a_list = list()
List initialized.
>>> type(a_list)
<class '__main__.List'>
Similarly we could remove it from the global namespace
del list
and put it in the builtin namespace:
import builtins
builtins.list = List
And now:
>>> list_0 = list()
List initialized.
>>> type(list_0)
<class '__main__.List'>
And note that the list display creates a list unconditionally:
>>> list_1 = []
>>> type(list_1)
<class 'list'>
We probably only do this temporarily, so lets undo our changes - first remove the new List
object from the builtins:
>>> del builtins.list
>>> builtins.list
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'builtins' has no attribute 'list'
>>> list()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'list' is not defined
Oh, no, we lost track of the original.
Not to worry, we can still get list
- it's the type of a list literal:
>>> builtins.list = type([])
>>> list()
[]
So...
Why is []
faster than list()
?
As we've seen - we can overwrite list
- but we can't intercept the creation of the literal type. When we use list
we have to do the lookups to see if anything is there.
Then we have to call whatever callable we have looked up. From the grammar:
A call calls a callable object (e.g., a function) with a possibly
empty series of arguments:
call ::= primary "(" [argument_list [","] | comprehension] ")"
We can see that it does the same thing for any name, not just list:
>>> import dis
>>> dis.dis('list()')
1 0 LOAD_NAME 0 (list)
2 CALL_FUNCTION 0
4 RETURN_VALUE
>>> dis.dis('doesnotexist()')
1 0 LOAD_NAME 0 (doesnotexist)
2 CALL_FUNCTION 0
4 RETURN_VALUE
For []
there is no function call at the Python bytecode level:
>>> dis.dis('[]')
1 0 BUILD_LIST 0
2 RETURN_VALUE
It simply goes straight to building the list without any lookups or calls at the bytecode level.
Conclusion
We have demonstrated that list
can be intercepted with user code using the scoping rules, and that list()
looks for a callable and then calls it.
Whereas []
is a list display, or a literal, and thus avoids the name lookup and function call.
()
and''
are special, as they're not only empty, they're immutable, and as such, it's an easy win to make them singletons; they don't even construct new objects, just load the singleton for the emptytuple
/str
. Technically an implementation detail, but I have a hard time imagining why they wouldn't cache the emptytuple
/str
for performance reasons. So your intuition about[]
and{}
passing back a stock literal was wrong, but it does apply to()
and''
. – ShadowRanger{}
faster than callingset()
? – cs95