Following M. O'Neill's great paper, The Genuine Sieve of Eratosthenes, I tried implementing some lazy, infinite versions of the Sieve of Eratosthenes in Python. I was surprised to find that the heap-based version, which the paper claims should run faster, was actually more than twice as slow for me.
The paper contains two examples, one based on a dict, which I have translated (from Haskell) thus:
from itertools import count

def dict_sieve():
    yield 2
    yield 3
    candidates = count(5, 2)
    composites = {9: {3}}  # map composites to their prime factors
    for candidate in candidates:
        try:
            factors = composites.pop(candidate)
        except KeyError:  # if it's not in the dict, it's prime
            yield candidate
            composites[candidate**2] = {candidate}  # Euler's optimization: start from prime**2
        else:
            for prime in factors:  # go through the prime factors and increment their keys
                try:
                    composites[candidate + prime*2].add(prime)  # use prime*2 because we are ignoring evens
                except KeyError:
                    composites[candidate + prime*2] = {prime}
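A quick sanity check that the generator behaves:

from itertools import islice

print(list(islice(dict_sieve(), 10)))  # -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]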
The second example in the paper demonstrates the use of a priority queue as the data structure. It also uses lazy lists rather than a simple increment, which I have not done here, in the interests of a fair test (besides, when I tried instances of itertools.count for my lazy lists, I found it to run slightly slower). A sketch of that lazy-list variant follows the main version below.
from itertools import count
from heapq import heappush, heapreplace

def heap_sieve():
    yield 2
    yield 3
    candidates = count(5, 2)
    composites = [(9, 3)]  # a priority queue of composite/factor pairs, keyed by composite
    for candidate in candidates:
        prime_flag = True
        while composites[0][0] == candidate:  # loop because there may be duplicates
            prime_flag = False  # signal to the if statement below
            composite, prime = composites[0]
            heapreplace(composites, (composite + prime*2, prime))
        if prime_flag:
            yield candidate
            heappush(composites, (candidate**2, candidate))
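For reference, a lazy-list variant using itertools.count as the per-prime stream might look something like this (a sketch, not necessarily exactly what I ran; note the prime is kept in each heap entry as a tie-breaker, since two count objects can't be compared when the keys collide):

from itertools import count
from heapq import heappush, heapreplace

def lazy_heap_sieve():
    yield 2
    yield 3
    # entries are (next multiple, prime, lazy list of further odd multiples);
    # the prime acts as a tie-breaker so equal keys never compare the iterators
    composites = [(9, 3, count(15, 6))]
    for candidate in count(5, 2):
        prime_flag = True
        while composites[0][0] == candidate:  # loop because there may be duplicates
            prime_flag = False
            _, prime, multiples = composites[0]
            heapreplace(composites, (next(multiples), prime, multiples))
        if prime_flag:
            yield candidate
            heappush(composites, (candidate**2, candidate,
                                  count(candidate**2 + candidate*2, candidate*2)))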
I timed the two main versions, along with an 'eager' version, not reproduced here (though a sketch follows the timings), which produces a list of all primes below a limit:
In [44]: from itertools import islice
In [45]: %timeit list(islice(dict_sieve(), 100000))
...: %timeit list(islice(heap_sieve(), 100000))
...: %timeit eager_sieve(1299710) # 1299709 is the 100,000th prime
...:
1 loops, best of 3: 2.12 s per loop
1 loops, best of 3: 4.94 s per loop
1 loops, best of 3: 677 ms per loop
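The eager version isn't reproduced above; for reference, a conventional bounded sieve along these lines would do the job (a sketch, not necessarily the exact code I timed):

def eager_sieve(limit):
    # classic bounded Sieve of Eratosthenes: all primes below limit
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit**0.5) + 1):
        if is_prime[n]:
            for multiple in range(n*n, limit, n):  # cross off from n*n upwards
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]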
It's not surprising that the 'eager' version is much quicker: it's basically a trade-off between memory use, the inconvenience of having to specify an upper limit in advance, and CPU time. However, I did find it surprising that the heapq version is so much slower, when the paper claims it to be the more efficient one. Is it a problem with my implementation? Or is it simply that, as we all know, dicts are super-fast (and heapq comparatively slow)?
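To separate the data structures' constant factors from the algorithms themselves, one could time the primitive operations directly, along these lines (a rough sketch; the sizes and keys are arbitrary):

from timeit import timeit

setup = """
from heapq import heappush, heappop
d = {2*i + 1: {3} for i in range(100000)}
h = [(2*i + 1, 3) for i in range(100000)]  # a sorted list is a valid heap
"""
# one O(1) dict pop plus reinsert
print(timeit("d[1] = d.pop(1)", setup=setup))
# one O(log n) heap pop plus push of the same entry
print(timeit("heappush(h, heappop(h))", setup=setup))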