5
votes

What is the most efficient and idiomatic way to combine two or more large vectors together? This is what I have been doing. In my application I'm using matrices, so each operation is a bit more expensive than adding two doubles. Using range to drive the fold feels a bit clumsy.

(require '[clojure.core.reducers :as r])

(def a (mapv (fn [_] (rand 100)) (range 100000)))
(def b (mapv (fn [_] (rand 100)) (range 100000)))
(r/foldcat (r/map #(+ (a %) (b %)) (range (count a))))

Also, calculating that range could end up being the most costly bit on multi-core CPUs, since it's the only non-parallel part and it involves sequences.

1
I assume it's mapv instead of vmap? – guilespi
Are a and b actually matrices instead of vectors? – Daniel Compton
Can you describe the shape of your data and the actual computation further? It's possible something like core.matrix may be a better fit. – Daniel Compton
Yes, mapv. I am using core.matrix (vectorz-clj). Typically I have one vector of 3x3 matrices and another of vector3's, and I need to multiply them together in a pairwise manner, like the pattern I presented here. This is for a large computational dynamics problem. There may be 100k elements in each vector, so parallelism is greatly desired. Also, I'm using Clojure for prototyping because I can develop and experiment much more efficiently in it. So I do want to keep things simple and take advantage of conveniences like core.reducers. – Michael Alexander Ewert

1 Answer

0
votes

It actually looks like Clojure 1.8 has a pretty good answer, and the pattern is already available in Clojure 1.7 using map-indexed.

Ideally I'd like a map-indexed that takes multiple collections like map does, but this will do. It looks fairly clojuresque, unlike my kludgy fold over a range.

(defn combine-with [op a-coll] (fn [i b-el] (op (a-coll i) b-el)))

(map-indexed (combine-with + a) b)
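
As a sequential point of comparison, here is a small sketch of the same pairwise pattern on toy data. The names xs, ys and zs are made up for illustration; combine-with is repeated so the snippet runs on its own. It relies on mapv accepting several collections, which covers the "two or more vectors" case when parallelism is not needed.

```clojure
;; Same helper as above: closes over a-coll and looks up the matching element by index.
(defn combine-with [op a-coll] (fn [i b-el] (op (a-coll i) b-el)))

;; Hypothetical toy inputs, small enough to check by eye.
(def xs [1 2 3])
(def ys [10 20 30])
(def zs [100 200 300])

;; Index-driven version, using the map-indexed transducer (Clojure 1.7+).
(def via-index (into [] (map-indexed (combine-with + xs)) ys))

;; mapv walks the collections in lockstep, so no index is needed.
(def via-mapv (mapv + xs ys))

;; The same call extends to three or more inputs.
(def three-way (mapv + xs ys zs))
```

Both two-input versions should produce the same vector; neither runs in parallel, so this is only a baseline for the fold-based timings.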

Just need to wait for 1.8 for the performance: http://dev.clojure.org/jira/browse/CLJ-1553

Here are some timings on a 6-core CPU:

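(require '[criterium.core :as c])     ; assumption: c is the criterium benchmarking library

(def irange (vec (range (count a))))  ; precompute the index vector so it stays out of the timing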

(c/quick-bench (def ab (r/foldcat (r/map #(+ (a %) (b %)) irange))))
             Execution time mean : 1.426060 ms

(c/quick-bench (def abt (into [] (map-indexed (combine-with + a)) b)))
             Execution time mean : 9.931824 ms
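
For the matrix case described in the comments (a vector of 3x3 matrices multiplied pairwise with a vector of 3-vectors), the fold-over-indices pattern might look roughly like the sketch below. This is only an illustration under stated assumptions: mat-vec and pairwise-fold are hypothetical helpers, and plain Clojure vectors stand in for core.matrix types so the snippet has no dependencies.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical helper: multiply a 3x3 matrix (a vector of row vectors)
;; by a 3-element vector. Stands in for a core.matrix multiply.
(defn mat-vec [m v]
  (mapv (fn [row] (reduce + (map * row v))) m))

;; Hypothetical helper: apply op to the i-th elements of a and b.
;; The indices are put in a vector because vectors are foldable,
;; which lets r/foldcat split the work across cores.
(defn pairwise-fold [op a b]
  (let [idx (vec (range (count a)))]
    (into [] (r/foldcat (r/map (fn [i] (op (nth a i) (nth b i))) idx)))))

;; Toy data: 1000 copies of a diagonal matrix and of a vector of ones.
(def ms (vec (repeat 1000 [[1 0 0] [0 2 0] [0 0 3]])))
(def vs (vec (repeat 1000 [1 1 1])))

(def out (pairwise-fold mat-vec ms vs))
```

The final into [] only converts the folded result back to a persistent vector; it can be dropped if a reducible result is enough.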