223
votes

One of the basic data structures in Python is the dictionary, which allows one to record "keys" for looking up "values" of any type. Is this implemented internally as a hash table? If not, what is it?

4
If you're interested in the technical details, one article in Beautiful Code deals with the internals of Python's dict implementation.Torsten Marek
That was one of my favorite chapters in Beautiful Code.DGentry
Here is a talk by Brandon Craig Rhodes discussing how python dictionary works, youtube.com/watch?v=C4Kc8xzcA68.chandola
I looked for a diagram representing a dict for a while now, which decipt the implementation in memory and CPython. Thanks for referencing the book!Chen A.

4 Answers

279
votes

Yes, it is a hash mapping or hash table. You can read a description of python's dict implementation, as written by Tim Peters, here.

That's why you can't use something 'not hashable' as a dict key, like a list:

>>> a = {}
>>> b = ['some', 'list']
>>> hash(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable
>>> a[b] = 'some'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable

You can read more about hash tables or check how it has been implemented in python and why it is implemented that way.

41
votes

There must be more to a Python dictionary than a table lookup on hash(). By brute experimentation I found this hash collision:

>>> hash(1.1)
2040142438
>>> hash(4504.1)
2040142438

Yet it doesn't break the dictionary:

>>> d = { 1.1: 'a', 4504.1: 'b' }
>>> d[1.1]
'a'
>>> d[4504.1]
'b'

Sanity check:

>>> for k,v in d.items(): print(hash(k))
2040142438
2040142438

Possibly there's another lookup level beyond hash() that avoids collisions between dictionary keys. Or maybe dict() uses a different hash.

(By the way, this in Python 2.7.10. Same story in Python 3.4.3 and 3.5.0 with a collision at hash(1.1) == hash(214748749.8).)

22
votes

Yes. Internally it is implemented as open hashing based on a primitive polynomial over Z/2 (source).

7
votes

To expand upon nosklo's explanation:

a = {}
b = ['some', 'list']
a[b] = 'some' # this won't work
a[tuple(b)] = 'some' # this will, same as a['some', 'list']