Purity, Referential Transparency and State Monad

Question

I'm currently designing a numerical algorithm which as part of its operations requires updating a vector of doubles many times. Due to the fact that the algorithm has to be as space and time efficient as possible, I do not want to code the traditional type of FP code which creates many versions of the data structure under the hood after each operation on it. Neither do I want to create mutable data structures and have them globally available. Consequently, I have decided to use a mutable data structure but then choose to do the the mutable operations in a State monad. Since this is my first stab at using a State monad, I want to confirm the whether I have or have not

preserved referential transparency
maintained functional purity

The update function transitions the data structure state. Since the destructive update is localised within this function and a handle to the data structure cannot be got at from outside, I think this function is pure and referentially transparent.

def update(i : Int,d : Double) = State[ArraySeq[Double], Unit]{
  case xs: ArraySeq[Double] => {xs(i) = d; (xs, ())}
}

The app function is a toy function which will consume a sequence of doubles and modify it's state:

def app : State[ArraySeq[Double], Unit] = for{
    _ <- update(0, 3.142)
  // do a heap of stuff on ArraySeq
}yield()

Call:

app(Vector(0.0, 1.0, 2.0, 3.0, 4.0).to[ArraySeq])._1.to[Vector]

Result:

res0: Vector[Double] = Vector(3.142, 1.0, 2.0, 3.0, 4.0)

Are you using scalaz's State or one you've written yourself? — Daenyth

Travis Brown Travis Brown · Accepted Answer · 2015-06-02T02:44:53

I guess you could say that your update itself is pure, in the sense that it only represents some mutation, but as soon as you run it all bets are off:

scala> val xs = List(1.0, 2.0, 3.0).to[ArraySeq]
xs: scala.collection.mutable.ArraySeq[Double] = ArraySeq(1.0, 2.0, 3.0)

scala> update(0, 10).eval(xs)
res0: scalaz.Id.Id[Unit] = ()

scala> xs
res1: scala.collection.mutable.ArraySeq[Double] = ArraySeq(10.0, 2.0, 3.0)

This is a bad scene, and it's the opposite of pure or referentially transparent.

State isn't really buying you anything in your example—the fact that you're calling app in such a way that you have an ArraySeq that nobody else can mutate is. You might as well bite the bullet and work with a mutable data structure in the usual way in a scope that you control—i.e., write app like this:

def app(xs: Vector[Double]): Vector[Double] = {
  val arr = xs.to[ArraySeq]
  // Perform all your updates in the usual way
  arr.toVector
}

This actually is pure and referentially transparent, but it's also more honest than the State version. If I see a value of type State[Foo, Unit], my assumption is going to be that this value represents some kind of operation that changes a Foo into a new Foo, without mutating the original Foo. This is all the state monad is—it provides a nice way of modeling operations on immutable data structures and composing them in a way that looks kind of like mutation. If you mix it with actual mutation you're likely to confuse the hell out of anyone using your code.

If you really want real mutation and purity at the same time, you can look at Scalaz's STArray. It's a very clever solution to this problem, and in languages like Haskell it's an approach that makes a lot of sense. My own feeling is that it's pretty much always the wrong solution in Scala, though. If you really need the performance of a mutable array, just use a local mutable array and make sure you don't leak it to the outside world. If you don't need that kind of performance (and most of the time you don't), use something like State.

Purity, Referential Transparency and State Monad

1 Answers