208
votes

What must I do to use my objects of a custom type as keys in a Python dictionary (where I don't want the "object id" to act as the key) , e.g.

class MyThing:
    def __init__(self,name,location,length):
            self.name = name
            self.location = location
            self.length = length

I'd want to use MyThing's as keys that are considered the same if name and location are the same. From C#/Java I'm used to having to override and provide an equals and hashcode method, and promise not to mutate anything the hashcode depends on.

What must I do in Python to accomplish this ? Should I even ?

(In a simple case, like here, perhaps it'd be better to just place a (name,location) tuple as key - but consider I'd want the key to be an object)

3
What's wrong with using the hash?Rafe Kettler
Probably because he wants two MyThing, if they have the same name and location, to index the dictionary to return the same value, even if they were created separately as two different "objects".Santa
"perhaps it'd be better to just place a (name,location) tuple as key - but consider I'd want the key to be an object)" You mean: a NON-COMPOSITE object ?eyquem

3 Answers

251
votes

You need to add 2 methods, note __hash__ and __eq__:

class MyThing:
    def __init__(self,name,location,length):
        self.name = name
        self.location = location
        self.length = length

    def __hash__(self):
        return hash((self.name, self.location))

    def __eq__(self, other):
        return (self.name, self.location) == (other.name, other.location)

    def __ne__(self, other):
        # Not strictly necessary, but to avoid having both x==y and x!=y
        # True at the same time
        return not(self == other)

The Python dict documentation defines these requirements on key objects, i.e. they must be hashable.

35
votes

An alternative in Python 2.6 or above is to use collections.namedtuple() -- it saves you writing any special methods:

from collections import namedtuple
MyThingBase = namedtuple("MyThingBase", ["name", "location"])
class MyThing(MyThingBase):
    def __new__(cls, name, location, length):
        obj = MyThingBase.__new__(cls, name, location)
        obj.length = length
        return obj

a = MyThing("a", "here", 10)
b = MyThing("a", "here", 20)
c = MyThing("c", "there", 10)
a == b
# True
hash(a) == hash(b)
# True
a == c
# False
20
votes

You override __hash__ if you want special hash-semantics, and __cmp__ or __eq__ in order to make your class usable as a key. Objects who compare equal need to have the same hash value.

Python expects __hash__ to return an integer, returning Banana() is not recommended :)

User defined classes have __hash__ by default that calls id(self), as you noted.

There is some extra tips from the documentation.:

Classes which inherit a __hash__() method from a parent class but change the meaning of __cmp__() or __eq__() such that the hash value returned is no longer appropriate (e.g. by switching to a value-based concept of equality instead of the default identity based equality) can explicitly flag themselves as being unhashable by setting __hash__ = None in the class definition. Doing so means that not only will instances of the class raise an appropriate TypeError when a program attempts to retrieve their hash value, but they will also be correctly identified as unhashable when checking isinstance(obj, collections.Hashable) (unlike classes which define their own __hash__() to explicitly raise TypeError).