4
votes

I'm still fairly new to clojure, but a pattern that I find myself using frequently in it goes something like this: I have some collections and I want to build a new collection, usually a hash-map, out of them with some filters or conditions. There are always a few ways to do this: using loop or using reduce combined with map/filter for example, but I would like to implement something more like the for macro, which has great syntax for controlling what gets evaluated in the loop. I'd like to produce a macro with syntax that goes like this:

(defmacro build
  "(build sym init-val [bindings...] expr) evaluates the given expression expr
   over the given bindings (treated identically to the bindings in a for macro); 
   the first time expr is evaluated the given symbol sym is bound to the init-val
   and every subsequent time to the previous expr. The return value is the result
   of the final expr. In essence, the build macro is to the reduce function
   as the for macro is to the map function.

   Example:
     (build m {} [x (range 4), y (range 4) :when (not= x y)]
       (assoc m x (conj (get m x #{}) y)))
      ;; ==> {0 #{1 3 2}, 1 #{0 3 2}, 2 #{0 1 3}, 3 #{0 1 2}}"
  [sym init-val [& bindings] expr]
  `(...))

Looking at the for code in clojure.core, it's pretty clear that I don't want to re-implement its syntax myself (even ignoring the ordinary perils of duplicating code), but coming up with for-like behavior in the above macro is a lot trickier than I initially expected. I eventually came up with the following, but I feel that (a) this probably isn't terribly performant and (b) there ought to be a better, still clojure-y, way to do this:

(defmacro build
  [sym init-val bindings expr]
  `(loop [result# ~init-val, s# (seq (for ~bindings (fn [~sym] ~expr)))]
     (if s#
       (recur ((first s#) result#) (next s#))
       result#))
   ;; or `(reduce #(%2 %1) ~init-val (for ~bindings (fn [~sym] ~expr)))

My specific questions:

  1. Is there a built-in clojure method or library that solves this already, perhaps more elegantly?
  2. Can someone who is more familiar with clojure performance give me an idea of whether this implementation is problematic and whether/how much I should be worried about performance, assuming that I may use this macro very frequently for relatively large collections?
  3. Is there any good reason that I should use the loop over the reduce version of the macro above, or vice versa?
  4. Can anyone see a better implementation of the macro?
2

2 Answers

1
votes

Your reduce version was also my first approach based on the problem statement. I think it's nice and straightforward and I'd expect it to work very well, particularly since for will produce a chunked seq that reduce will be able to iterate over very quickly.

for generates functions to do output generation anyway and I wouldn't expect the extra layer introduced by the build expansion to be particularly problematic. It may still be worthwhile to benchmark this version based on volatile! as well:

(defmacro build [sym init-val bindings expr]
  `(let [box# (volatile! ~init-val)] ; AtomicReference would also work
     (doseq ~bindings
       (vreset! box# (let [~sym @box#] ~expr)))
     @box#))

Criterium is great for benchmarking and will eliminate any performance-related guesswork.

0
votes

I don't want to quite take your example code of your doc string since it's not idiomatic clojure. But taking plumbing.core's for-map, you can come up with a similar for-map-update:

(defn update!
  "Like update but for transients."
  ([m k f] (assoc! m k (f (get m k))))
  ([m k f x1] (assoc! m k (f (get m k) x1)))
  ([m k f x1 x2] (assoc! m k (f (get m k) x1 x2)))
  ([m k f x1 x2 & xs] (assoc! m k (apply f (get m k) x1 x2 xs))))

(defmacro for-map-update
  "Like 'for-map' for building maps but accepts a function as the value to build map values."
  ([seq-exprs key-expr val-expr]
   `(for-map-update ~(gensym "m") ~seq-exprs ~key-expr ~val-expr))
  ([m-sym seq-exprs key-expr val-expr]
   `(let [m-atom# (atom (transient {}))]
      (doseq ~seq-exprs
        (let [~m-sym @m-atom#]
          (reset! m-atom# (update! ~m-sym ~key-expr ~val-expr))))
      (persistent! @m-atom#))))

(for-map-update
  [x (range 4)
   y (range 4)
   :when (not= x y)]
  x (fnil #(conj % y) #{} ))
;; => {0 #{1 3 2}, 1 #{0 3 2}, 2 #{0 1 3}, 3 #{0 1 2}}