Using filterfalse without a lambda expression
When using functions like filter or filterfalse (and similar ones from itertools), you can usually gain performance by passing an already existing function instead of a lambda expression. Instances of list and set define a __contains__ method that is used for containment checks. The in operator calls this method under the hood, so x in l2 can be replaced by l2.__contains__(x). Usually this replacement is not any prettier, but in this specific case it gives better performance than a lambda expression when combined with filterfalse:
>>> from itertools import filterfalse
>>> l1 = [1, 2, 6, 8]
>>> l2 = [2, 3, 5, 8]
>>> list(filterfalse(l2.__contains__, l1))
[1, 6]
filterfalse creates an iterator that yields every element of l1 for which l2.__contains__ returns False.
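To make the equivalence concrete, here is a minimal sketch comparing the lambda form with the bound-method form. The names with_lambda and with_method are illustrative only and do not come from the original answer.

```python
from itertools import filterfalse

l1 = [1, 2, 6, 8]
l2 = [2, 3, 5, 8]

# The `in` operator and the bound method give the same answer.
assert (2 in l2) == l2.__contains__(2)

# Lambda form: one extra Python-level function call per element.
with_lambda = list(filterfalse(lambda x: x in l2, l1))

# Bound-method form: filterfalse calls l2.__contains__ directly.
with_method = list(filterfalse(l2.__contains__, l1))

print(with_lambda, with_method)
```

Both forms produce the same list; only the per-element call overhead differs.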
Sets have a faster implementation of __contains__ (a hash lookup instead of a linear scan), so this is even better:
>>> from itertools import filterfalse
>>> l1 = [1, 2, 6, 8]
>>> l2 = set([2, 3, 5, 8])
>>> list(filterfalse(l2.__contains__, l1))
[1, 6]
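One property worth noting, shown here as a small sketch with made-up input values: only l2 is turned into a set, so the order and any duplicates in l1 are preserved. This is different from computing a plain set difference, which discards both.

```python
from itertools import filterfalse

l1 = [8, 1, 6, 1, 2]   # hypothetical input: unsorted, with a duplicate
l2 = {2, 3, 5, 8}

# Elements of l1 are tested one by one, in their original order.
result = list(filterfalse(l2.__contains__, l1))

# A set difference loses the order and the duplicate.
as_set_difference = set(l1) - l2

print(result)             # [1, 6, 1]
print(as_set_difference)  # {1, 6}
```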
Performance
Using list:
$ python3 -m timeit -s "from itertools import filterfalse; l1 = [1,2,6,8]; l2 = [2,3,5,8];" "list(filterfalse(l2.__contains__, l1))"
500000 loops, best of 5: 522 nsec per loop
Using set:
$ python3 -m timeit -s "from itertools import filterfalse; l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "list(filterfalse(l2.__contains__, l1))"
1000000 loops, best of 5: 359 nsec per loop
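The measurements above do not include the lambda variant. The following sketch, which assumes nothing beyond the standard timeit module, times all four combinations in one script. The names candidates and timings are hypothetical, and absolute numbers will vary by machine and Python version, so no expected output is shown.

```python
import timeit
from itertools import filterfalse

l1 = [1, 2, 6, 8]
l2_list = [2, 3, 5, 8]
l2_set = set(l2_list)

# Each candidate is a zero-argument callable so timeit can call it directly.
candidates = {
    "list, lambda": lambda: list(filterfalse(lambda x: x in l2_list, l1)),
    "list, __contains__": lambda: list(filterfalse(l2_list.__contains__, l1)),
    "set, lambda": lambda: list(filterfalse(lambda x: x in l2_set, l1)),
    "set, __contains__": lambda: list(filterfalse(l2_set.__contains__, l1)),
}

# Total seconds for 100,000 calls of each candidate.
timings = {
    name: timeit.timeit(func, number=100_000)
    for name, func in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name:20s} {seconds:.4f} s")
```

The wrapper lambda around each candidate adds the same small overhead to every measurement, so the comparison between rows is still meaningful.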