3
votes

I am trying to implement a custom class which returns a different value when called as list(c) or dict(c). However, it is my impression that both list(c) and dict(c) use c.__iter__() under the hood. If that is the case, how can I get different behaviour from list(c) and dict(c)? I know that it is possible because Python dictionaries and pandas DataFrames behave differently under the two calls.

For example:

class Foo:
    def __init__(self):
        self._keys = ['a', 'b', 'd', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __iter__(self):
        for key, value in zip(self._keys, self._data):
            yield key, value

Calling dict(c) I get what I want:

>>> f = Foo()
>>> dict(f)
{'a': 10, 'b': 20, 'd': 40, 'e': 50}

However, I can't get list(c) to print out a list of keys (or values), but instead get both:

>>> f = Foo()
>>> list(f)
[('a', 10), ('b', 20), ('d', 30), ('d', 40), ('e', 50)]

The equivalent code for a dictionary is much cleaner:

>>> f = {'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}
>>> dict(f)
{'a': 10, 'b': 20, 'c': 30, 'd': 40, 'e': 50}
>>> list(f)
['a', 'b', 'c', 'd', 'e']

2
This is because you yield tuples: key, value forms a tuple, and the Python list function just stores them all in a list. – lordingtar
I know, but if I yield one value at a time, dict(c) does not work properly. – ostrokach
list and dict call the same magic methods; there is no way to make a class return something different depending on what called it. I'd suggest that you make keys() and values() methods to return lists. – Novel
You need f to be a Mapping type in order to get that sort of behavior. Inheriting from collections.abc.Mapping might be helpful... – mgilson

2 Answers

5
votes

__iter__ must yield only the keys; otherwise list(f) cannot return a list of keys.

The Python documentation says the following of the dict constructor:

If a positional argument is given and it is a mapping object, a dictionary is created with the same key-value pairs as the mapping object.

Now, the question is: what counts as enough of a "mapping" for the dict constructor? DataFrame doesn't inherit from any mapping class, nor is it registered against an abstract base class. It turns out we only need to support the keys method: if the object passed to the dict constructor has a method called keys, it is called to provide an iterable of the keys [CPython source]. For each key, the value is then fetched by indexing.

I.e. the dict constructor does the logical equivalent of the following:

if hasattr(source, 'keys'):
    for k in source.keys():
        self[k] = source[k]
else:
    self.update(iter(source))

Using this we get

class Foo:
    def __init__(self):
        self._keys = ['a', 'b', 'd', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __iter__(self):
        # iterate over the key list itself; self.keys is a method, not an iterable
        return iter(self._keys)

    def __getitem__(self, key):
        idx = self._keys.index(key)
        return self._data[idx]

    def keys(self):
        return self._keys

Testing:

>>> f = Foo()
>>> list(f)
['a', 'b', 'd', 'd', 'e']

>>> dict(f)
{'d': 30, 'e': 50, 'a': 10, 'b': 20}

(As you can see from the code above, there is no need to actually inherit from anything)
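
To check the claim about keys and indexing, here is a small sketch. Traced is a hypothetical class made up for this illustration; it records which of its methods get called. The recorded sequence is what I would expect on CPython, and I would not rely on it as a language guarantee.

```python
class Traced:
    """Hypothetical duck-typed mapping that records which methods are called."""

    def __init__(self):
        self._store = {'a': 10, 'b': 20}
        self.calls = []

    def keys(self):
        self.calls.append('keys')
        return list(self._store)

    def __getitem__(self, key):
        self.calls.append(f'getitem:{key}')
        return self._store[key]

    def __iter__(self):
        self.calls.append('iter')
        return iter(self._store)


t = Traced()

as_dict = dict(t)            # goes through keys() and __getitem__
dict_calls = list(t.calls)   # snapshot of the calls made by dict()

t.calls.clear()
as_list = list(t)            # goes through __iter__ only
list_calls = list(t.calls)

print(as_dict, dict_calls)
print(as_list, list_calls)
```

On CPython, dict_calls should come out as ['keys', 'getitem:a', 'getitem:b'] and list_calls as ['iter'], so the two constructors take different paths through the same object.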

However, it is not guaranteed that all mapping constructors behave the same way; others might call items instead. The most compatible approach is therefore to inherit from collections.abc.Mapping and implement the three abstract methods it requires (__getitem__, __iter__ and __len__). I.e. it would be enough to do

import collections.abc

class Foo(collections.abc.Mapping):
    ...
    def __getitem__(self, key):
        idx = self._keys.index(key)
        return self._data[idx]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)
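
A side benefit of inheriting from Mapping is that keys, items, values and equality come for free as mixin methods. A minimal sketch follows, where Pairs and its constructor arguments are hypothetical names of mine and the keys are assumed to be unique (unlike the duplicated 'd' in the question):

```python
from collections.abc import Mapping


class Pairs(Mapping):
    """Hypothetical read-only mapping backed by two parallel lists."""

    def __init__(self, keys, data):
        # assumption: keys are unique and both sequences have equal length
        self._keys = list(keys)
        self._data = list(data)

    def __getitem__(self, key):
        return self._data[self._keys.index(key)]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


p = Pairs(['a', 'b', 'c'], [10, 20, 30])

print(list(p))           # ['a', 'b', 'c']
print(dict(p))           # {'a': 10, 'b': 20, 'c': 30}
print(list(p.items()))   # [('a', 10), ('b', 20), ('c', 30)], items() is a mixin method
print(list(p.values()))  # [10, 20, 30], values() is a mixin method
print(p == {'a': 10, 'b': 20, 'c': 30})  # True, __eq__ is a mixin method
```

Only the three abstract methods are written by hand; everything else is derived from them by the base class.
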
2
votes

@mgilson's comment is correct: this can be accomplished by inheriting from the collections.abc.Mapping class:

import collections.abc

class Foo(collections.abc.Mapping):
    def __init__(self):
        self._keys = ['a', 'b', 'd', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __iter__(self):
        for key in self._keys:
            yield key

    def __getitem__(self, key):
        # look up the position of the key, then return the matching value
        return self._data[self._keys.index(key)]

    def __len__(self):
        return len(self._keys)

>>> f = Foo()
>>> list(f)
['a', 'b', 'd', 'd', 'e']

>>> dict(f)
{'a': 10, 'b': 20, 'd': 30, 'e': 50}
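
One caveat with the list.index lookup: a missing key raises ValueError, whereas the Mapping mixins for `in` and get() only catch KeyError. A hedged sketch of one way around that is below. SafeFoo is a hypothetical name, and I have assumed unique keys ('c' in place of the repeated 'd'):

```python
from collections.abc import Mapping


class SafeFoo(Mapping):
    """Hypothetical variant of Foo whose __getitem__ raises KeyError on a miss."""

    def __init__(self):
        # assumption: unique keys, unlike the duplicated 'd' in the question
        self._keys = ['a', 'b', 'c', 'd', 'e']
        self._data = [10, 20, 30, 40, 50]

    def __getitem__(self, key):
        try:
            return self._data[self._keys.index(key)]
        except ValueError:
            # list.index reports a miss with ValueError; mappings are expected
            # to raise KeyError, which the Mapping mixin methods rely on
            raise KeyError(key) from None

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


f = SafeFoo()
has_z = 'z' in f         # False instead of an uncaught ValueError
default = f.get('z', 0)  # falls back to the default, 0
print(has_z, default)
```

Without the try/except, both `'z' in f` and f.get('z', 0) would propagate the ValueError from list.index.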