9
votes

I have 2 dictionaries, dict1 and dict2 which contain the same keys, but different values for the keys. What I want to do is for each dictionary, sort the values from largest to smallest, and then give each value a rank 1-N, 1 being the largest value. From here, I want to get the difference of the ranks for the values in each dictionary for the same key. For example:

dict1 = {a:0.6, b:0.3, c:0.9, d:1.2, e:0.2}
dict2 = {a:1.4, b:7.7, c:9.0, d:2.5, e:2.0}

# sorting by values would look like this:
dict1 = {d:1.2, c:0.9, a:0.6, b:0.3, e:0.2}
dict2 = {c:9.0, b:7.7, d:2.5, e:2.0, a:1.4}

#ranking the values would produce this:
dict1 = {d:1, c:2, a:3, b:4, e:5}
dict2 = {c:1, b:2, d:3, e:4, a:5}

#computing the difference between ranks would be something like this:
diffs = {}
for x in dict1.keys():
    diffs[x] = (dict1[x] - dict2[x])

#diffs would look like this:
diffs[a] = -2
diffs[b] = 2
diffs[c] = 1
diffs[d] = -2
diffs[e] = 1

I know dictionaries are meant to be random and not sortable, but maybe there is a method to put the keys and values into a list? The main challenges I am facing are getting the keys and values sorted by value (largest to smallest) and then changing the value to its respective rank in the sorted list.

4

4 Answers

8
votes

A simple solution for small dicts is

dict1 = {"a":0.6, "b":0.3, "c":0.9, "d":1.2, "e":0.2}
dict2 = {"a":1.4, "b":7.7, "c":9.0, "d":2.5, "e":2.0}
k1 = sorted(dict1, key=dict1.get)
k2 = sorted(dict2, key=dict2.get)
diffs = dict((k, k2.index(k) - k1.index(k)) for k in dict1)

A more efficient, less readable version for larger dicts:

ranks1 = dict(map(reversed, enumerate(sorted(dict1, key=dict1.get))))
ranks2 = dict(map(reversed, enumerate(sorted(dict2, key=dict2.get))))
diffs = dict((k, ranks2[k] - ranks1[k]) for k in dict1)
6
votes

You may be interested in collections.OrderedDict

Here's a sample, my initial thougth is you were also looking for dictionaries with keys ordered by values, things that od1 and od2 are.

d1 = {"a":0.6, "b":0.3, "c":0.9, "d":1.2, "e":0.2}
d2 = {"a":1.4, "b":7.7, "c":9.0, "d":2.5, "e":2.0}

od1 = OrderedDict(sorted(d1.items(), key=lambda t: t[1]))
od2 = OrderedDict(sorted(d2.items(), key=lambda t: t[1]))

k1 = od1.keys()
k2 = od2.keys()

diff = dict((k, n - k2.index(k)) for n, k in enumerate(k1))

If you don't need them then Sven solution is probably faster.

edit: not that faster honestly... (sven.py is his second, more efficient version):

$ cat /tmp/mine.py | time python -m timeit
10000000 loops, best of 3: 0.0842 usec per loop
real    0m 3.69s
user    0m 3.38s
sys 0m 0.03s
$ cat /tmp/sven.py | time python -m timeit
10000000 loops, best of 3: 0.085 usec per loop
real    0m 3.86s
user    0m 3.42s
sys 0m 0.03s

If someone wants to post formatted bigger dicts I'll test them too.

2
votes

What version of python are you using? If 2.7, use OrderedDict.

Per the Python 2.7 docs:

OrderedDict(sorted(d.items(), key=d.get))

If you're using Python 2.4-2.6 you can still use OrderedDict by installing it from pypi here or if you have setuptools, run

easy_install ordereddict
0
votes

A dictionary is not the right data structure to solve this problem. You should convert to sorted lists as soon as possible and produce the dictionary only as the final result. The following sample solution uses iterators and generator expressions where possible, to avoid creating too many (potentially large) helper lists along the way:

def get_ranking(vals):
    '''Return a list of pairs: (key, ranking), sorted by key.'''
    ranking = sorted(((v, k) for k, v in vals.iteritems()), reverse=True)
    return sorted((k, i) for (i, (_v, k)) in enumerate(ranking))

def ranking_diff(rank1, rank2):
    return dict((k, v1 - v2) for (k, v1), (_, v2) in itertools.izip(rank1, rank2))

def get_diffs(dict1, dict2):
    r1 = get_ranking(dict1)
    r2 = get_ranking(dict2)
    return ranking_diff(r1, r2)

print get_diffs(dict1, dict2)
# prints: {'a': -2, 'c': 1, 'b': 2, 'e': 1, 'd': -2}

Please note that this solution assumes that both dicts contain exactly the same keys.