I'm attempting to learn Clojure from the API and documentation available on the site. I'm a bit unclear about mutable storage in Clojure and I want to make sure my understanding is correct. Please let me know if there are any ideas that I've gotten wrong.

Edit: I'm updating this as I receive comments on its correctness.


Disclaimer: All of this information is informal and potentially wrong. Do not use this post for gaining an understanding of how Clojure works.


Vars always contain a root binding and possibly a per-thread binding. They are comparable to regular variables in imperative languages and are not suited for sharing information between threads. (thanks Arthur Ulfeldt)

Refs are locations shared between threads that support atomic transactions that can change the state of any number of refs in a single transaction. Transactions are committed upon exiting sync expressions (dosync) and conflicts are resolved automatically with STM magic (rollbacks, queues, waits, etc.)
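As a rough illustration of the point about changing several refs in one transaction, here is a minimal sketch. The names checking, savings and transfer! are made up for this example and are not from any library.

```clojure
;; Two hypothetical account balances held in refs.
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to another inside a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

;; Both changes became visible together when the dosync block exited.
[@checking @savings] ; => [70 30]
```

No other thread can observe a state in which the money has left one ref but not yet arrived in the other.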

Agents are locations that enable information to be asynchronously shared between threads with minimal overhead by dispatching independent action functions to change the agent's state. Dispatching an action (with send or send-off) returns immediately and is therefore non-blocking, although the agent's value isn't updated until the dispatched function has completed.
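A small sketch of dispatching actions to an agent, assuming a hypothetical agent named events that holds a vector:

```clojure
(def events (agent []))   ; hypothetical agent holding a vector

;; send returns right away; the action runs later on a pool thread.
(send events conj :first)
(send events conj :second)

;; await blocks the calling thread until the actions sent so far have run.
(await events)

@events ; => [:first :second]
```

Actions sent to one agent from the same thread are applied in the order they were sent, which is why the vector comes out in that order. In a standalone script, (shutdown-agents) is typically needed at the end so the JVM can exit.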

Atoms are locations that can be synchronously shared between threads. They support safe manipulation between different threads.
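To make "safe manipulation between different threads" concrete, here is a minimal sketch in which several threads update one atom. The names counter and workers are illustrative only.

```clojure
(def counter (atom 0))

;; Ten futures each increment the shared atom 1000 times.
(def workers
  (doall (for [_ (range 10)]
           (future (dotimes [_ 1000] (swap! counter inc))))))

;; Dereferencing each future waits for that worker to finish.
(doseq [w workers] @w)

@counter ; => 10000
```

No increments are lost, even though no explicit lock is taken anywhere in the code.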

Here's my friendly summary based on when to use these structures:

  • Vars are like regular old variables in imperative languages. (avoid when possible)
  • Atoms are like Vars but with thread-sharing safety that allows for immediate reading and safe setting. (thanks Martin)
  • An Agent is like an Atom but rather than blocking it spawns a new thread to calculate its value, only blocks if in the middle of changing a value, and can let other threads know that it's finished assigning.
  • Refs are shared locations that lock themselves in transactions. Instead of making the programmer decide what happens during race conditions for every piece of locked code, we just start up a transaction and let Clojure handle all the lock conditions between the refs in that transaction.

Also, a related concept is the function future. To me, it seems like a future object can be described as a synchronous Agent where the value can't be accessed at all until the calculation is completed. It can also be described as a non-blocking Atom. Are these accurate conceptions of future?

5 Answers

6
votes

It sounds like you are really getting Clojure! Good job :)

Vars have a "root binding" visible in all threads, and each individual thread can change the value it sees without affecting the other threads. If my understanding is correct, a var cannot exist in just one thread without a root binding that is visible to all, and it can't be "rebound" until it has been defined with (def ... ) the first time.
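Here is a minimal sketch of a per-thread binding next to the root binding. It assumes a reasonably recent Clojure, where a var must be marked ^:dynamic before binding can rebind it. The names *level* and observed are made up for the example, and a raw java.lang.Thread is used on purpose because it does not inherit the caller's bindings.

```clojure
(def ^:dynamic *level* :root)   ; root binding, visible to every thread

(def observed
  (binding [*level* :thread-local]          ; rebinds for this thread only
    (let [other (promise)
          ;; A plain Thread starts without this thread's bindings.
          t     (Thread. (fn [] (deliver other *level*)))]
      (.start t)
      (.join t)
      {:this-thread *level* :other-thread @other})))

observed ; => {:this-thread :thread-local, :other-thread :root}
*level*  ; => :root (the binding ended with the binding form)
```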

Refs are committed at the end of the (dosync ... ) transaction that encloses the changes but only when the transaction was able to finish in a consistent state.

4
votes

I think your conclusion about Atoms is wrong:

Atoms are like Vars but with thread-sharing safety that blocks until the value has changed

Atoms are changed with swap! or, at a lower level, with compare-and-set!. Neither ever takes a lock or blocks another thread. swap! works like a transaction with just one ref:

  1. the old value is read from the atom and held locally by the calling thread
  2. the function is applied to the old value to generate a new value
  3. if this succeeds, compare-and-set! is called with the old and new values; the new value is written only if no other thread has changed the atom in the meantime (it still holds the old value), otherwise the operation restarts at (1) until it eventually succeeds.
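The three steps above can be sketched as a retry loop built on compare-and-set!. This is only an illustration of the idea, not Clojure's actual implementation of swap!, and my-swap! and hits are hypothetical names.

```clojure
(defn my-swap!
  "Illustrative re-implementation of swap! for a one-argument function."
  [a f]
  (loop []
    (let [old-val (deref a)      ; step 1: read the current value
          new-val (f old-val)]   ; step 2: compute the new value
      ;; step 3: write only if the atom still holds old-val, else retry
      (if (compare-and-set! a old-val new-val)
        new-val
        (recur)))))

(def hits (atom 0))
(my-swap! hits inc) ; => 1
```

Because f may run more than once when there is contention, it should be free of side effects.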
3
votes

I've found two issues with your question.

You say:

If an agent is accessed while an action is occurring then the value isn't returned until the action has finished

http://clojure.org/agents says:

the state of an Agent is always immediately available for reading by any thread

I.e. you never have to wait to get the value of an agent (I assume the value changed by an action is proxied and changed atomically).

The code for the deref-method of an Agent looks like this (SVN revision 1382):

public Object deref() throws Exception{
    if(errors != null)
    {
        throw new Exception("Agent has errors", (Exception) RT.first(errors));
    }
    return state;
}

No blocking is involved.
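One way to see this from Clojure is to read an agent while one of its actions is still running. In the sketch below, gate, started and status are hypothetical names, and the promises are used only to hold the action open in a controlled way.

```clojure
(def gate    (promise))      ; holds the action open until delivered
(def started (promise))      ; signals that the action is running
(def status  (agent :idle))

;; send-off is used because the action blocks while waiting on gate.
(send-off status (fn [_]
                   (deliver started true)
                   @gate
                   :done))

@started                       ; wait until the action is in progress
(def during-action @status)    ; returns immediately => :idle

(deliver gate :go)             ; let the action finish
(await status)
(def after-action @status)     ; => :done
```

The read in the middle returns the old state at once instead of waiting for the action to complete.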

Also, I don't understand what you mean (in your Ref section) by

Transactions are committed on calls to deref

Transactions are committed when all actions of the dosync block have been completed, no exceptions have been thrown and nothing has caused the transaction to be retried. I think deref has nothing to do with it, but maybe I misunderstand your point.
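A minimal sketch of the "no exceptions have been thrown" condition: if the body of dosync throws, nothing is committed. The refs balance and audit and the name outcome are invented for this example.

```clojure
(def balance (ref 100))
(def audit   (ref []))

(def outcome
  (try
    (dosync
      (alter balance - 150)
      (alter audit conj :withdrawal)
      ;; Inside the transaction, @balance is the in-transaction value (-50).
      (when (neg? @balance)
        (throw (IllegalStateException. "insufficient funds"))))
    :committed
    (catch Exception _ :aborted)))

;; Neither ref was changed, because the transaction never committed.
[outcome @balance @audit] ; => [:aborted 100 []]
```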

1
votes

Martin is right when he says that an atom's operation restarts at (1) until it eventually succeeds. This is also called spin waiting. While it is not really blocking on a lock, the thread that performed the operation cannot proceed until the operation succeeds, so it is a blocking operation and not an asynchronous one.

Also, regarding futures: Clojure 1.1 added abstractions for promises and futures. A promise is a synchronization construct that can be used to deliver a value from one thread to another. Until the value has been delivered, any attempt to dereference the promise will block.

(def a-promise (promise))
(deliver a-promise :fred)

Futures represent asynchronous computations. They are a way to get code to run in another thread, and obtain the result.

(def f (future (some-sexp)))
(deref f) ; blocks the thread that derefs f until value is available
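Putting the two together, here is a small runnable sketch in which a future waits on a promise. The names answer, doubled and ready-before? are hypothetical, and realized? assumes a Clojure version newer than 1.1.

```clojure
(def answer (promise))

;; The future's body runs on another thread and blocks there
;; until answer has been delivered.
(def doubled (future (* 2 @answer)))

(def ready-before? (realized? answer))  ; => false, nothing delivered yet

(deliver answer 21)

@doubled ; => 42, blocks the calling thread until the future finishes
```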
0
votes

Vars don't always have a root binding. It's legal to create a var without a binding using

(def x)

or

(declare x)

Attempting to evaluate x before it has a value will result in

Var user/x is unbound.
[Thrown class java.lang.IllegalStateException]
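If you want to check for this case without triggering an error, bound? reports whether a var currently has a value. This is a small sketch assuming a Clojure version that provides bound?; the var pending and the other names are made up for illustration.

```clojure
(declare pending)                          ; var exists, no root binding yet

(def bound-before? (bound? #'pending))     ; => false

(def pending 42)                           ; now give it a root binding

(def bound-after? (bound? #'pending))      ; => true
```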