43
votes

I have a hard time to understand STArray from the documentation and other howtos/discussion I've found through Google. I've got some more related questions below.

According to the documentation, STArrays are

Mutable boxed and unboxed arrays in the ST monad.

This gave me the impression, that STArray is meant to be used as a state being passed around between functions (imagine you have a vector that has to be updated often).

Apparently this is used differently though:

ST s (STArray s a e)

What is the state s here? If it is used internally, then why is this not hidden from the user?

This also means, if we want to use a STArray s Int Int being passed around as state, one would define

type StateArray a = Control.Monad.State (ST s (STArray s Int Int)) a

which seems rather cumbersome.

Finally,

  • what is the difference between ST and State?
  • what is the difference between STArray and IOArray, if the ST and IO are meant for "internal" use?

Thank you!!

1
Usually Vectors are easier to use than arrays if you want to take a look at those.alternative

1 Answers

78
votes

ST is a monad in which a limited type of side effects are allowed, namely mutable references and mutable arrays. Thus it allows you to implement functions which are pure as seen from the outside world, but which use mutation internally.

This is different from State, which only fakes mutation by threading the state through your computation as extra inputs and outputs. The difference is important when implementing some imperative algorithms, because they sometimes need mutation to be implemented efficiently. For example using a regular array in a State monad, you can only modify it by making a copy, whereas with ST you can have true mutation in-place.

The reason why we have both ST and IO is that ST provides stronger guarantees than IO, namely:

  1. ST does not allow arbitrary side effects like for example accessing the file system.
  2. We can guarantee that the side effects ST does allow cannot escape the scope of runST, and so it can be seen as pure from the outside world.

The reason why we can guarantee that the side effects cannot escape is related to the type variable s. Since any ST action must be polymorphic in s, you cannot write code which allows any mutable references to enter or leave the scope of a runST, because the type checker will complain that it cannot guarantee that the s of your action and that of the reference or array are the same unless they come from the same runST scope.

As an example of using the ST monad with mutable arrays, here is an implementation of the Sieve of Erathostenes:

import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed

primesUpto :: Int -> [Int]
primesUpto n = [p | (p, True) <- assocs $ sieve n]

sieve :: Int -> UArray Int Bool
sieve n = runSTUArray $ do
    sieve <- newArray (2, n) True
    forM_ [2..n] $ \p -> do
        isPrime <- readArray sieve p
        when isPrime $ do
            forM_ [p*2, p*3 .. n] $ \k -> do
                writeArray sieve k False
    return sieve

runSTUArray is a specialized form of runST which allows you to build an array using mutation on the inside, before freezing it and returning it as an immutable array. newArray, readArray and writeArray do what you'd expect.

As you can see, the type signature of sieve indicates that it's a pure function, and it is. However, it uses mutation heavily on the inside to implement it efficiently.