107
votes

I have tried reading up on this but I still don't understand the value of them or what they replace. And do they make my code shorter, more understandable or what?

Update

A lot of people posted answers, but it would be nice to see examples with and without transducers for something very simple, which even an idiot like me can understand. Unless, of course, transducers need a certain high level of understanding, in which case I will never understand them :(

12 Answers

78
votes

Transducers are recipes for what to do with a sequence of data, written without knowledge of what the underlying sequence is or how it is traversed. The source can be any seq, an async channel, or perhaps an observable.

They are composable and polymorphic.

The benefit is that you don't have to reimplement all the standard combinators every time a new data source is added. As a result, you as a user can reuse those recipes on different data sources.

Prior to version 1.7 of Clojure you had three ways to write dataflow queries:

  1. nested calls

    (reduce + (filter odd? (map #(+ 2 %) (range 0 10))))
    
  2. functional composition

    (def xform
      (comp
        (partial filter odd?)
        (partial map #(+ 2 %))))
    (reduce + (xform (range 0 10)))
    
  3. threading macro

    (defn xform [xs]
      (->> xs
           (map #(+ 2 %))
           (filter odd?)))
    (reduce + (xform (range 0 10)))
    

With transducers you will write it like:

(def xform
  (comp
    (map #(+ 2 %))
    (filter odd?)))
(transduce xform + (range 0 10))

They all do the same thing. The difference is that you never call transducers directly; you pass them to another function. Transducers know what to do, and the function that receives a transducer knows how. The order of combinators is the same as with the threading macro (natural order). Now you can reuse xform with a core.async channel:

(chan 1 xform)
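
The same reuse works without core.async. As a minimal sketch (xform is redefined so the block runs on its own; the names total, as-vec and as-lazy are made up for illustration), the one recipe can be handed to three different consumers from clojure.core:

```clojure
;; Same recipe as above, redefined here so the block is self-contained.
(def xform
  (comp
    (map #(+ 2 %))
    (filter odd?)))

;; Illustrative names; each consumer decides *how* the recipe is applied.
(def total   (transduce xform + (range 0 10)))  ;=> 35, reduced to a number
(def as-vec  (into [] xform (range 0 10)))      ;=> [3 5 7 9 11], eager vector
(def as-lazy (sequence xform (range 0 10)))     ;=> (3 5 7 9 11), lazy sequence
```

The transformation is written once; only the function that receives it changes.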

50
votes

Transducers improve efficiency, and allow you to write efficient code in a more modular way.

This is a decent run through.

Compared to composing calls to the old map, filter, reduce etc. you get better performance because you don't need to build intermediate collections between each step, and repeatedly walk those collections.
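
One way to see that no intermediate collection is built is to log each step. In the sketch below, the trace atom and the names xf and result are hypothetical helpers for this illustration only; they record the order in which map and filter see the elements.

```clojure
;; `trace` is a hypothetical log used only for this illustration.
(def trace (atom []))

(def xf
  (comp
    (map    (fn [x] (swap! trace conj [:map x])    (inc x)))
    (filter (fn [x] (swap! trace conj [:filter x]) (even? x)))))

(def result (into [] xf [1 2]))   ;=> [2]

;; Each element passes through map and then filter before the next
;; element is touched, so there is no mapped collection in between:
@trace   ;=> [[:map 1] [:filter 2] [:map 2] [:filter 3]]
```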

Compared to reducers, or manually composing all your operations into a single expression, you get easier to use abstractions, better modularity and reuse of processing functions.

25
votes

Say you want to use a series of functions to transform a stream of data. The Unix shell lets you do this kind of thing with the pipe operator, e.g.

cat /etc/passwd | tr '[:lower:]' '[:upper:]' | cut -d: -f1| grep R| wc -l

(The above command counts the number of users with the letter r in either upper- or lowercase in their username). This is implemented as a set of processes, each of which reads from the previous process's output, so there are four intermediate streams. You could imagine a different implementation that composes the five commands into a single aggregate command, which would read from its input and write its output exactly once. If intermediate streams were expensive, and composition were cheap, that might be a good trade-off.

The same kind of thing holds for Clojure. There are multiple ways to express a pipeline of transformations, but depending on how you do it, you can end up with intermediate streams passing from one function to the next. If you have a lot of data, it's faster to compose those functions into a single function. Transducers make it easy to do that. An earlier Clojure innovation, reducers, let you do that too, but with some restrictions. Transducers remove some of those restrictions.
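
As a rough sketch of the analogy, the shell pipeline above could be written as one composed transformation. The sample lines are made up for illustration, and the names passwd-lines, usernames-with-r and user-count are hypothetical; str/includes? assumes Clojure 1.8 or later.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample lines in /etc/passwd format (invented data).
(def passwd-lines
  ["root:x:0:0:root:/root:/bin/bash"
   "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin"
   "alice:x:1000:1000::/home/alice:/bin/bash"
   "robert:x:1001:1001::/home/robert:/bin/bash"])

;; tr | cut | grep composed into a single transformation.
(def usernames-with-r
  (comp
    (map str/upper-case)                   ; tr '[:lower:]' '[:upper:]'
    (map #(first (str/split % #":")))      ; cut -d: -f1
    (filter #(str/includes? % "R"))))      ; grep R

;; wc -l is the reducing step: add 1 for every line that survives.
(def user-count
  (transduce usernames-with-r
             (completing (fn [n _] (inc n)))
             0
             passwd-lines))                ;=> 2 (ROOT and ROBERT)
```

Each line goes through all three steps before the next line is read, which is the "single aggregate command" described above.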

So to answer your question, transducers won't necessarily make your code shorter or more understandable, but your code probably won't be longer or less understandable either, and if you're working with a lot of data, transducers can make your code faster.

This is a pretty good overview of transducers.

22
votes

Transducers are a means of combination for reducing functions.

Example: Reducing functions are functions that take two arguments: A result so far and an input. They return a new result (so far). For example +: With two arguments, you can think of the first as the result so far and the second as the input.

A transducer could now take the + function and make it a twice-plus function (one that doubles every input before adding it). This is what that transducer would look like (in most basic terms):

(defn double
  [rfn]
  (fn [r i] 
    (rfn r (* 2 i))))

For illustration substitute rfn with + to see how + is transformed into twice-plus:

(def twice-plus ;; result of (double +)
  (fn [r i] 
    (+ r (* 2 i))))

(twice-plus 1 2)  ;-> 5
(= (twice-plus 1 2) ((double +) 1 2)) ;-> true

So

(reduce (double +) 0 [1 2 3]) 

would now yield 12.

Reducing functions returned by transducers are independent of how the result is accumulated, because they accumulate with whatever reducing function is passed to them, without knowing how it works. Here we use conj instead of +. conj takes a collection and a value and returns a new collection with that value added (appended, in the case of a vector).

(reduce (double conj) [] [1 2 3]) 

would yield [2 4 6]

They are also independent of what kind of source the input is.

Multiple transducers can be chained as a (chainable) recipe to transform reducing functions.
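
A sketch of such chaining follows. It assumes a fuller version of the doubling step, here called doubling (a hypothetical name), that also handles the zero-argument (init) and one-argument (completion) arities that transduce and into expect, and chains it with the built-in (map inc):

```clojure
;; `doubling` is an illustrative name; it is the basic transducer
;; from above plus the init and completion arities.
(defn doubling [rf]
  (fn
    ([] (rf))                                    ; init
    ([result] (rf result))                       ; completion
    ([result input] (rf result (* 2 input)))))   ; step

;; Chained transducers run left to right: double first, then inc.
(def double-then-inc (comp doubling (map inc)))

(def summed    (transduce double-then-inc + 0 [1 2 3]))  ;=> 15
(def collected (into [] double-then-inc [1 2 3]))        ;=> [3 5 7]
```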

Update: Since there is now an official page about it, I highly recommend reading it: http://clojure.org/transducers

10
votes

Rich Hickey gave a 'Transducers' talk at the Strange Loop 2014 conference (45 min).

He explains in a simple way what transducers are, with a real-world example: processing bags in an airport. He clearly separates the different aspects and contrasts them with the current approaches. Towards the end, he gives the rationale for their existence.

Video: https://www.youtube.com/watch?v=6mTbuzafcII

8
votes

I've found reading examples from transducers-js helps me understand them in concrete terms of how I might use them in day-to-day code.

For instance, consider this example (taken from the README at the link above):

var t = require("transducers-js");

var map    = t.map,
    filter = t.filter,
    comp   = t.comp,
    into   = t.into;

var inc    = function(n) { return n + 1; };
var isEven = function(n) { return n % 2 == 0; };
var xf     = comp(map(inc), filter(isEven));

console.log(into([], xf, [0,1,2,3,4])); // [2,4]

For one, using xf looks much cleaner than the usual alternative with Underscore.

_.filter(_.map([0, 1, 2, 3, 4], inc), isEven);

8
votes

Transducers are (to my understanding!) functions which take one reducing function and return another. A reducing function is one which takes the result so far and a new input and returns an updated result.

For example:

user> (def my-transducer (comp count filter))
#'user/my-transducer
user> (my-transducer even? [0 1 2 3 4 5 6])
4
user> (my-transducer #(< 3 %) [0 1 2 3 4 5 6])
3

Strictly speaking, my-transducer here is ordinary function composition rather than a transducer: (comp count filter) calls filter on its arguments and passes the resulting lazy sequence to count. It still hints at the idea. In the first case the filtering function even? is applied to each value in turn, and every value that passes is counted, so the result is 4 (0, 2, 4 and 6).

The second example works the same way with #(< 3 %): each value greater than 3 (4, 5 and 6) lets count add 1, giving 3.
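
For comparison, here is a sketch of the same two counts written with transduce. The helper count-rf and the result names are made up for illustration; completing (in clojure.core since 1.7) supplies the completion arity that a plain two-argument function lacks.

```clojure
;; Hypothetical reducing function: ignore the input, add 1 to the count.
(def count-rf (completing (fn [n _] (inc n))))

;; (filter pred) with no collection returns a transducer.
(def evens-count (transduce (filter even?)    count-rf 0 [0 1 2 3 4 5 6]))  ;=> 4
(def big-count   (transduce (filter #(< 3 %)) count-rf 0 [0 1 2 3 4 5 6]))  ;=> 3
```

Here each value that passes the filter is handed straight to the counting step, without building a filtered sequence first.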

7
votes

A clear definition of transducers is here:

Transducers are a powerful and composable way to build algorithmic transformations that you can reuse in many contexts, and they’re coming to Clojure core and core.async.

To understand it, let's consider the following simple example:

;; The Families in the Village

(def village
  [{:home :north :family "smith" :name "sue" :age 37 :sex :f :role :parent}
   {:home :north :family "smith" :name "stan" :age 35 :sex :m :role :parent}
   {:home :north :family "smith" :name "simon" :age 7 :sex :m :role :child}
   {:home :north :family "smith" :name "sadie" :age 5 :sex :f :role :child}

   {:home :south :family "jones" :name "jill" :age 45 :sex :f :role :parent}
   {:home :south :family "jones" :name "jeff" :age 45 :sex :m :role :parent}
   {:home :south :family "jones" :name "jackie" :age 19 :sex :f :role :child}
   {:home :south :family "jones" :name "jason" :age 16 :sex :f :role :child}
   {:home :south :family "jones" :name "june" :age 14 :sex :f :role :child}

   {:home :west :family "brown" :name "billie" :age 55 :sex :f :role :parent}
   {:home :west :family "brown" :name "brian" :age 23 :sex :m :role :child}
   {:home :west :family "brown" :name "bettie" :age 29 :sex :f :role :child}

   {:home :east :family "williams" :name "walter" :age 23 :sex :m :role :parent}
   {:home :east :family "williams" :name "wanda" :age 3 :sex :f :role :child}])

What if we want to know how many children are in the village? We can easily find out with the following reducer:

;; Example 1a - using a reducer to add up all the mapped values

(def ex1a-map-children-to-value-1 (r/map #(if (= :child (:role %)) 1 0)))

(r/reduce + 0 (ex1a-map-children-to-value-1 village))
;;=> 8

Here is another way to do it:

;; Example 1b - using a transducer to add up all the mapped values

;; create the transducers using the new arity for map that
;; takes just the function, no collection

(def ex1b-map-children-to-value-1 (map #(if (= :child (:role %)) 1 0)))

;; now use transduce (cf. r/reduce) with the transducer to get the answer
(transduce ex1b-map-children-to-value-1 + 0 village)
;;=> 8

It is also powerful when taking subgroups into account. For instance, if we would like to know how many children are in the Brown family, we can execute:

;; Example 2a - using a reducer to count the children in the Brown family

;; create the reducer to select members of the Brown family
(def ex2a-select-brown-family (r/filter #(= "brown" (string/lower-case (:family %)))))

;; compose a composite function to select the Brown family and map children to 1
(def ex2a-count-brown-family-children (comp ex1a-map-children-to-value-1 ex2a-select-brown-family))

;; reduce to add up all the Brown children
(r/reduce + 0 (ex2a-count-brown-family-children village))
;;=> 2
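
A transducer counterpart to Example 2a might look like the sketch below. The ex2b-* names and the trimmed-down sample-village are assumptions made so the block runs on its own; note that composed transducers run left to right, so the filter comes first in comp, the opposite of the reducer version above.

```clojure
(require '[clojure.string :as string])

;; A trimmed-down, illustrative copy of the village data.
(def sample-village
  [{:family "smith" :name "simon"  :role :child}
   {:family "brown" :name "billie" :role :parent}
   {:family "brown" :name "brian"  :role :child}
   {:family "brown" :name "bettie" :role :child}])

;; transducer that selects members of the Brown family
(def ex2b-select-brown-family
  (filter #(= "brown" (string/lower-case (:family %)))))

;; transducer that maps children to 1 and everyone else to 0
(def ex2b-map-children-to-value-1
  (map #(if (= :child (:role %)) 1 0)))

;; select first, then map: transducers compose left to right
(def ex2b-count-brown-family-children
  (comp ex2b-select-brown-family ex2b-map-children-to-value-1))

(def brown-children
  (transduce ex2b-count-brown-family-children + 0 sample-village))
;;=> 2
```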

I hope you find these examples helpful. You can find more here

Clemencio Morales Lucas.

4
votes

I blogged about this with a ClojureScript example which explains how the sequence functions are now extensible, because the reducing function can be replaced.

This is the point of transducers as I read it. The cons or conj operation used to be hard-coded in operations like map and filter, so the reducing function was unreachable.

With transducers, the reducing function is decoupled, and I can replace it, as I did with the native JavaScript array push:

(transduce (filter #(not (.hasOwnProperty prevChildMapping %)))
           (completing #(doto %1 (.push %2))) #js [] nextKeys)

filter and friends have a new 1-arity version that returns a transducer, to which you can supply your own reducing function.
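
The same decoupling can be sketched in plain Clojure. Below, one transducer (keep-odd, a name made up for this example) is paired with three different reducing functions, each of which decides what the result looks like:

```clojure
;; Illustrative name; (filter odd?) with no collection is a transducer.
(def keep-odd (filter odd?))

(def as-sum    (transduce keep-odd +    0  (range 10)))  ;=> 25
(def as-vector (transduce keep-odd conj [] (range 10)))  ;=> [1 3 5 7 9]
(def as-string (transduce keep-odd str  "" (range 10)))  ;=> "13579"
```

The filtering logic never mentions numbers, vectors or strings; that choice lives entirely in the reducing function.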

4
votes

Here's my (mostly) jargon-free and code-free answer.

Think of data in two ways, a stream (values that occur over time such as events) or a structure (data that exists at a point in time such as a list, a vector, an array etc).

There are certain operations that you might want to perform over either streams or structures. One such operation is mapping. A mapping function might increment each data item (assuming it is a number) by 1 and you can hopefully imagine how this could apply to either a stream or a structure.

A mapping function is just one of a class of functions that are sometimes referred to as "reducing functions". Another common reducing function is filter, which keeps only the values that match a predicate (e.g. keep only the values that are even).

Transducers let you "wrap" a sequence of one or more reducing functions and produce a "package" (which is itself a function) that works on both streams or structures. For example, you could "package" a sequence of reducing functions (e.g. filter even numbers, then map the resulting numbers to increment them by 1) and then use that transducer "package" on either a stream or structure of values (or both).

So what is special about this? Typically, reducing functions cannot be efficiently composed to work on both streams and structures.

So the benefit to you is that you can leverage your knowledge around these functions and apply them to more use cases. The cost to you is that you have to learn some extra machinery (i.e. the transducer) to give you this extra power.

3
votes

As far as I understand, they're like building blocks, decoupled from the input and output implementation. You just define the operation.

As the implementation of the operation is not in the input's code and nothing is done with the output, transducers are extremely reusable. They remind me of Flows in Akka Streams.

I'm also new to transducers, sorry for the possibly-unclear answer.

1
votes