48
votes

I've been working on solving Project Euler problems in Clojure to get better, and I've already run into prime number generation a couple of times. My problem is that it is just taking way too long. I was hoping someone could help me find an efficient way to do this in a Clojure-y way.

When I fist did this, I brute-forced it. That was easy to do. But calculating 10001 prime numbers took 2 minutes this way on a Xeon 2.33GHz, too long for the rules, and too long in general. Here was the algorithm:

(defn next-prime-slow
    "Find the next prime number, checking against our already existing list"
    ([sofar guess]
        (if (not-any? #(zero? (mod guess %)) sofar)
            guess                         ; Then we have a prime
            (recur sofar (+ guess 2)))))  ; Try again                               

(defn find-primes-slow
    "Finds prime numbers, slowly"
    ([]
        (find-primes-slow 10001 [2 3]))   ; How many we need, initial prime seeds
    ([needed sofar]
        (if (<= needed (count sofar))
            sofar                         ; Found enough, we're done
            (recur needed (concat sofar [(next-prime-slow sofar (last sofar))])))))

By replacing next-prime-slow with a newer routine that took some additional rules into account (like the 6n +/- 1 property) I was able to speed things up to about 70 seconds.

Next I tried making a sieve of Eratosthenes in pure Clojure. I don't think I got all the bugs out, but I gave up because it was simply way too slow (even worse than the above, I think).

(defn clean-sieve
    "Clean the sieve of what we know isn't prime based"
    [seeds-left sieve]
    (if (zero? (count seeds-left))
        sieve              ; Nothing left to filter the list against
        (recur
            (rest seeds-left)    ; The numbers we haven't checked against
            (filter #(> (mod % (first seeds-left)) 0) sieve)))) ; Filter out multiples

(defn self-clean-sieve  ; This seems to be REALLY slow
    "Remove the stuff in the sieve that isn't prime based on it's self"
    ([sieve]
        (self-clean-sieve (rest sieve) (take 1 sieve)))
    ([sieve clean]
        (if (zero? (count sieve))
            clean
            (let [cleaned (filter #(> (mod % (last clean)) 0) sieve)]
                (recur (rest cleaned) (into clean [(first cleaned)]))))))

(defn find-primes
    "Finds prime numbers, hopefully faster"
    ([]
        (find-primes 10001 [2]))
    ([needed seeds]
        (if (>= (count seeds) needed)
            seeds        ; We have enough
            (recur       ; Recalculate
                needed
                (into
                    seeds    ; Stuff we've already found
                    (let [start (last seeds)
                            end-range (+ start 150000)]   ; NOTE HERE
                        (reverse                                                
                            (self-clean-sieve
                            (clean-sieve seeds (range (inc start) end-range))))))))))

This is bad. It also causes stack overflows if the number 150000 is smaller. This despite the fact I'm using recur. That may be my fault.

Next I tried a sieve, using Java methods on a Java ArrayList. That took quite a bit of time, and memory.

My latest attempt is a sieve using a Clojure hash-map, inserting all the numbers in the sieve then dissoc'ing numbers that aren't prime. At the end, it takes the key list, which are the prime numbers it found. It takes about 10-12 seconds to find 10000 prime numbers. I'm not sure it's fully debugged yet. It's recursive too (using recur and loop), since I'm trying to be Lispy.

So with these kind of problems, problem 10 (sum up all primes under 2000000) is killing me. My fastest code came up with the right answer, but it took 105 seconds to do it, and needed quite a bit of memory (I gave it 512 MB just so I wouldn't have to fuss with it). My other algorithms take so long I always ended up stopping them first.

I could use a sieve to calculate that many primes in Java or C quite fast and without using so much memory. I know I must be missing something in my Clojure/Lisp style that's causing the problem.

Is there something I'm doing really wrong? Is Clojure just kinda slow with large sequences? Reading some of the project Euler discussions people have calculated the first 10000 primes in other Lisps in under 100 miliseconds. I realize the JVM may slow things down and Clojure is relatively young, but I wouldn't expect a 100x difference.

Can someone enlighten me on a fast way to calculate prime numbers in Clojure?

16
Are you trying to generate lots of primes, large primes? Test primality? What's the goal?JP Alioto
I was looking for a general algorithm. Partly this is just to improve my understanding of the language. One problem asked for the 10001st prime, one for the sum of all under 2000000. I expect there will be more. My algorithms above are all targeted at generating primes in order.MBCook
Not an answer, but something I found interesting ... bigdingus.com/2008/07/01/finding-primes-with-erlang-and-clojureJP Alioto
I had the same problem with Project Euler and Haskell, though not of the same magnitude. I'd implement the same algorithm in C and Haskell, and the C program would take a half second whereas Haskell took thirty. This is mostly due to the fact that I don't really know how to add strictness to Haskell, as some algorithms take about equal times in both languages.Dietrich Epp
Check Alex Martelli's Python version: stackoverflow.com/questions/2068372/… The difference is that one would not know how many numbers will be asked for in advance.Hamish Grubijan

16 Answers

29
votes

Here's another approach that celebrates Clojure's Java interop. This takes 374ms on a 2.4 Ghz Core 2 Duo (running single-threaded). I let the efficient Miller-Rabin implementation in Java's BigInteger#isProbablePrime deal with the primality check.

(def certainty 5)

(defn prime? [n]
      (.isProbablePrime (BigInteger/valueOf n) certainty))

(concat [2] (take 10001 
   (filter prime? 
      (take-nth 2 
         (range 1 Integer/MAX_VALUE)))))

The Miller-Rabin certainty of 5 is probably not very good for numbers much larger than this. That certainty is equal to 96.875% certain it's prime (1 - .5^certainty)

21
votes

I realize this is a very old question, but I recently ended up looking for the same and the links here weren't what I'm looking for (restricted to functional types as much as possible, lazily generating ~every~ prime I want).

I stumbled upon a nice F# implementation, so all credits are his. I merely ported it to Clojure:

(defn gen-primes "Generates an infinite, lazy sequence of prime numbers"
  []
  (letfn [(reinsert [table x prime]
             (update-in table [(+ prime x)] conj prime))
          (primes-step [table d]
             (if-let [factors (get table d)]
               (recur (reduce #(reinsert %1 d %2) (dissoc table d) factors)
                      (inc d))
               (lazy-seq (cons d (primes-step (assoc table (* d d) (list d))
                                              (inc d))))))]
    (primes-step {} 2)))

Usage is simply

(take 5 (gen-primes))    
12
votes

Very late to the party, but I'll throw in an example, using Java BitSets:

(defn sieve [n]
  "Returns a BitSet with bits set for each prime up to n"
  (let [bs (new java.util.BitSet n)]
    (.flip bs 2 n)
    (doseq [i (range 4 n 2)] (.clear bs i))
    (doseq [p (range 3 (Math/sqrt n))]
      (if (.get bs p)
        (doseq [q (range (* p p) n (* 2 p))] (.clear bs q))))
    bs))

Running this on a 2014 Macbook Pro (2.3GHz Core i7), I get:

user=> (time (do (sieve 1e6) nil))
"Elapsed time: 64.936 msecs"
11
votes

See the last example here: http://clojuredocs.org/clojure_core/clojure.core/lazy-seq

;; An example combining lazy sequences with higher order functions
;; Generate prime numbers using Eratosthenes Sieve
;; See http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
;; Note that the starting set of sieved numbers should be
;; the set of integers starting with 2 i.e., (iterate inc 2) 
(defn sieve [s]
  (cons (first s)
        (lazy-seq (sieve (filter #(not= 0 (mod % (first s)))
                                 (rest s))))))

user=> (take 20 (sieve (iterate inc 2)))
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)
4
votes

Here's a nice and simple implementation:

http://clj-me.blogspot.com/2008/06/primes.html

... but it is written for some pre-1.0 version of Clojure. See lazy_seqs in Clojure Contrib for one that works with the current version of the language.

3
votes
(defn sieve
  [[p & rst]]
  ;; make sure the stack size is sufficiently large!
  (lazy-seq (cons p (sieve (remove #(= 0 (mod % p)) rst)))))

(def primes (sieve (iterate inc 2)))

with a 10M stack size, I get the 1001th prime in ~ 33 seconds on a 2.1Gz macbook.

3
votes

So I've just started with Clojure, and yeah, this comes up a lot on Project Euler doesn't it? I wrote a pretty fast trial division prime algorithm, but it doesn't really scale too far before each run of divisions becomes prohibitively slow.

So I started again, this time using the sieve method:

(defn clense
  "Walks through the sieve and nils out multiples of step"
  [primes step i]
  (if (<= i (count primes))
    (recur 
      (assoc! primes i nil)
      step
      (+ i step))
    primes))

(defn sieve-step
  "Only works if i is >= 3"
  [primes i]
  (if (< i (count primes))
    (recur
      (if (nil? (primes i)) primes (clense primes (* 2 i) (* i i)))
      (+ 2 i))
    primes))

(defn prime-sieve
  "Returns a lazy list of all primes smaller than x"
  [x]
  (drop 2 
    (filter (complement nil?)
    (persistent! (sieve-step 
      (clense (transient (vec (range x))) 2 4) 3)))))

Usage and speed:

user=> (time (do (prime-sieve 1E6) nil))
"Elapsed time: 930.881 msecs

I'm pretty happy with the speed: it's running out of a REPL running on a 2009 MBP. It's mostly fast because I completely eschew idiomatic Clojure and instead loop around like a monkey. It's also 4X faster because I'm using a transient vector to work on the sieve instead of staying completely immutable.

Edit: After a couple of suggestions / bug fixes from Will Ness it now runs a whole lot faster.

2
votes

Here's a simple sieve in Scheme:

http://telegraphics.com.au/svn/puzzles/trunk/programming-in-scheme/primes-up-to.scm

Here's a run for primes up to 10,000:

#;1> (include "primes-up-to.scm")
; including primes-up-to.scm ...
#;2> ,t (primes-up-to 10000)
0.238s CPU time, 0.062s GC time (major), 180013 mutations, 130/4758 GCs (major/minor)
(2 3 5 7 11 13...
2
votes

Here is a Clojure solution. i is the current number being considered and p is a list of all prime numbers found so far. If division by some prime numbers has a remainder of zero, the number i is not a prime number and recursion occurs with the next number. Otherwise the prime number is added to p in the next recursion (as well as continuing with the next number).

(defn primes [i p]
  (if (some #(zero? (mod i %)) p)
    (recur (inc i) p)
    (cons i (lazy-seq (primes (inc i) (conj p i))))))
(time (do (doall (take 5001 (primes 2 []))) nil))
; Elapsed time: 2004.75587 msecs
(time (do (doall (take 10001 (primes 2 []))) nil))
; Elapsed time: 7700.675118 msecs

Update: Here is a much slicker solution based on this answer above. Basically the list of integers starting with two is filtered lazily. Filtering is performed by only accepting a number i if there is no prime number dividing the number with remainder of zero. All prime numbers are tried where the square of the prime number is less or equal to i. Note that primes is used recursively but Clojure manages to prevent endless recursion. Also note that the lazy sequence primes caches results (that's why the performance results are a bit counter intuitive at first sight).

(def primes
  (lazy-seq
    (filter (fn [i] (not-any? #(zero? (rem i %))
                              (take-while #(<= (* % %) i) primes)))
            (drop 2 (range)))))
(time (first (drop 10000 primes)))
; Elapsed time: 542.204211 msecs
(time (first (drop 20000 primes)))
; Elapsed time: 786.667644 msecs
(time (first (drop 40000 primes)))
; Elapsed time: 1780.15807 msecs
(time (first (drop 40000 primes)))
; Elapsed time: 8.415643 msecs
1
votes

Based on Will's comment, here is my take on postponed-primes:

(defn postponed-primes-recursive
  ([]
     (concat (list 2 3 5 7)
             (lazy-seq (postponed-primes-recursive
                        {}
                        3
                        9
                        (rest (rest (postponed-primes-recursive)))
                        9))))
  ([D p q ps c]
     (letfn [(add-composites
               [D x s]
               (loop [a x]
                 (if (contains? D a)
                   (recur (+ a s))
                   (persistent! (assoc! (transient D) a s)))))]
       (loop [D D
              p p
              q q
              ps ps
              c c]
         (if (not (contains? D c))
           (if (< c q)
             (cons c (lazy-seq (postponed-primes-recursive D p q ps (+ 2 c))))
             (recur (add-composites D
                                    (+ c (* 2 p))
                                    (* 2 p))
                    (first ps)
                    (* (first ps) (first ps))
                    (rest ps)
                    (+ c 2)))
           (let [s (get D c)]
             (recur (add-composites
                     (persistent! (dissoc! (transient D) c))
                     (+ c s)
                     s)
                    p
                    q
                    ps
                    (+ c 2))))))))

Initial submission for comparison:

Here is my attempt to port this prime number generator from Python to Clojure. The below returns an infinite lazy sequence.

(defn primes
  []
  (letfn [(prime-help
            [foo bar]
            (loop [D foo
                   q bar]
              (if (nil? (get D q))
                (cons q (lazy-seq
                         (prime-help
                          (persistent! (assoc! (transient D) (* q q) (list q)))
                          (inc q))))
                (let [factors-of-q (get D q)
                      key-val (interleave
                               (map #(+ % q) factors-of-q)
                               (map #(cons % (get D (+ % q) (list)))
                                    factors-of-q))]
                  (recur (persistent!
                          (dissoc!
                           (apply assoc! (transient D) key-val)
                           q))
                         (inc q))))))]
    (prime-help {} 2)))

Usage:

user=> (first (primes))
2
user=> (second (primes))
3
user=> (nth (primes) 100)
547
user=> (take 5 (primes))
(2 3 5 7 11)
user=> (time (nth (primes) 10000))
"Elapsed time: 409.052221 msecs"
104743

edit:

Performance comparison, where postponed-primes uses a queue of primes seen so far rather than a recursive call to postponed-primes:

user=> (def counts (list 200000 400000 600000 800000))
#'user/counts
user=> (map #(time (nth (postponed-primes) %)) counts)
("Elapsed time: 1822.882 msecs"
 "Elapsed time: 3985.299 msecs"
 "Elapsed time: 6916.98 msecs"
 "Elapsed time: 8710.791 msecs"
2750161 5800139 8960467 12195263)
user=> (map #(time (nth (postponed-primes-recursive) %)) counts)
("Elapsed time: 1776.843 msecs"
 "Elapsed time: 3874.125 msecs"
 "Elapsed time: 6092.79 msecs"
 "Elapsed time: 8453.017 msecs"
2750161 5800139 8960467 12195263)
1
votes

Idiomatic, and not too bad

(def primes
  (cons 1 (lazy-seq
            (filter (fn [i]
                      (not-any? (fn [p] (zero? (rem i p)))
                                (take-while #(<= % (Math/sqrt i))
                                            (rest primes))))
                    (drop 2 (range))))))
=> #'user/primes
(first (time (drop 10000 primes)))
"Elapsed time: 0.023135 msecs"
=> 104729
0
votes

From: http://steloflute.tistory.com/entry/Clojure-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%A8-%EC%B5%9C%EC%A0%81%ED%99%94

Using Java array

(defmacro loopwhile [init-symbol init whilep step & body]
  `(loop [~init-symbol ~init]
     (when ~whilep ~@body (recur (+ ~init-symbol ~step)))))

(defn primesUnderb [limit]
  (let [p (boolean-array limit true)]
    (loopwhile i 2 (< i (Math/sqrt limit)) 1
               (when (aget p i)
                 (loopwhile j (* i 2) (< j limit) i (aset p j false))))
    (filter #(aget p %) (range 2 limit))))

Usage and speed:

user=> (time (def p (primesUnderb 1e6)))
"Elapsed time: 104.065891 msecs"
0
votes

After coming to this thread and searching for a faster alternative to those already here, I am surprised nobody linked to the following article by Christophe Grand :

(defn primes3 [max]
  (let [enqueue (fn [sieve n factor]
                  (let [m (+ n (+ factor factor))]
                    (if (sieve m)
                      (recur sieve m factor)
                      (assoc sieve m factor))))
        next-sieve (fn [sieve candidate]
                     (if-let [factor (sieve candidate)]
                       (-> sieve
                         (dissoc candidate)
                         (enqueue candidate factor))
                       (enqueue sieve candidate candidate)))]
    (cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))

As well as a lazy version :

(defn lazy-primes3 []
  (letfn [(enqueue [sieve n step]
            (let [m (+ n step)]
              (if (sieve m)
                (recur sieve m step)
                (assoc sieve m step))))
          (next-sieve [sieve candidate]
            (if-let [step (sieve candidate)]
              (-> sieve
                (dissoc candidate)
                (enqueue candidate step))
              (enqueue sieve candidate (+ candidate candidate))))
          (next-primes [sieve candidate]
            (if (sieve candidate)
              (recur (next-sieve sieve candidate) (+ candidate 2))
              (cons candidate 
                (lazy-seq (next-primes (next-sieve sieve candidate) 
                            (+ candidate 2))))))]
    (cons 2 (lazy-seq (next-primes {} 3)))))
0
votes

Plenty of answers already, but I have an alternative solution which generates an infinite sequence of primes. I was also interested on bechmarking a few solutions.

First some Java interop. for reference:

(defn prime-fn-1 [accuracy]
  (cons 2
    (for [i (range)
          :let [prime-candidate (-> i (* 2) (+ 3))]
          :when (.isProbablePrime (BigInteger/valueOf prime-candidate) accuracy)]
      prime-candidate)))

Benjamin @ https://stackoverflow.com/a/7625207/3731823 is primes-fn-2

nha @ https://stackoverflow.com/a/36432061/3731823 is primes-fn-3

My implementations is primes-fn-4:

(defn primes-fn-4 []
  (let [primes-with-duplicates
         (->> (for [i (range)] (-> i (* 2) (+ 5))) ; 5, 7, 9, 11, ...
              (reductions
                (fn [known-primes candidate]
                  (if (->> known-primes
                           (take-while #(<= (* % %) candidate))
                           (not-any?   #(-> candidate (mod %) zero?)))
                   (conj known-primes candidate)
                   known-primes))
                [3])     ; Our initial list of known odd primes
              (cons [2]) ; Put in the non-odd one
              (map (comp first rseq)))] ; O(1) lookup of the last element of the vec "known-primes"

    ; Ugh, ugly de-duplication :(
    (->> (map #(when (not= % %2) %) primes-with-duplicates (rest primes-with-duplicates))
         (remove nil?))))

Reported numbers (time in milliseconds to count first N primes) are the fastest from the run of 5, no JVM restarts between experiments so your mileage may vary:

                     1e6      3e6

(primes-fn-1  5)     808     2664
(primes-fn-1 10)     952     3198
(primes-fn-1 20)    1440     4742
(primes-fn-1 30)    1881     6030
(primes-fn-2)       1868     5922
(primes-fn-3)        489     1755  <-- WOW!
(primes-fn-4)       2024     8185 
0
votes

If you don't need a lazy solution and you just want a sequence of primes below a certain limit, the straight forward implementation of the Sieve of Eratosthenes is pretty fast. Here's my version using transients:

(defn classic-sieve
  "Returns sequence of primes less than N"
  [n]
  (loop [nums (transient (vec (range n))) i 2]
    (cond
     (> (* i i) n) (remove nil? (nnext (persistent! nums)))
     (nums i) (recur (loop [nums nums j (* i i)]
                       (if (< j n)
                         (recur (assoc! nums j nil) (+ j i))
                         nums))
                     (inc i))
     :else (recur nums (inc i)))))
0
votes

I just started using Clojure so I don't know if it's good but here is my solution:

(defn divides? [x i]
  (zero? (mod x i)))

(defn factors [x]
    (flatten (map #(list % (/ x %)) 
                 (filter #(divides? x %) 
                        (range 1 (inc (Math/floor (Math/sqrt x))))))))

(defn prime? [x]
  (empty? (filter #(and divides? (not= x %) (not= 1 %)) 
                 (factors x))))

(def primes 
  (filter prime? (range 2 java.lang.Integer/MAX_VALUE)))

(defn sum-of-primes-below [n]
  (reduce + (take-while #(< % n) primes)))