How to use thread safe shared variables in Haskell

Question

IORefs, MVars, and TVars can be used to wrap a shared variable in a concurrent context. I've studied concurrent haskell for a while and now I've encounted some questions. After searching on stackoverflow and read through some related question, my questions are not fully resolved.

According to the IORef documentation,"Extending the atomicity to multiple IORefs is problematic", can someone help to explain why a single IORef is safe but more than one IORefs are problematic?
modifyMVar is "exception-safe, but only atomic if there are no other producers for this MVar". See MVar's documentation. The source code show that modifyMVar does only compose a getMVar and putMVar sequencially, indicating that it's note thread-safe if there is another producer. But if there is no producer and all threads behave in the "takeMVar then putMVar" way, then is it thread-safe to simply use modifyMVar ?

To give a concrete situation, I'll show the actual problem. I've some shared variables which are never empty and I want them be mutable states so some threads can simultaneously modify these variables.

OK, it seems tha TVar solve everything clearly. But I'm not satisfied with it and I'm eager for answers to the questions above. Any help are appreciated.

-------------- re: @GabrielGonzalez BFS interface code ------------------

Code below is my BFS interface using state monad.

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}

module Data.Graph.Par.Class where

import Data.Ix
import Data.Monoid
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad
import Control.Monad.Trans.State

class (Ix (Vertex g), Ord (Edge g), Ord (Path g)) => ParGraph g where
  type Vertex g :: *
  type Edge g :: * 
  -- type Path g :: *           -- useless
  type VertexProperty g :: *
  type EdgeProperty g :: *
  edges :: g a -> IO [Edge g]
  vertexes :: g a -> IO [Vertex g]
  adjacencies :: g a -> Vertex g -> IO [Vertex g]
  vertexProperty :: Vertex g -> g a -> IO (VertexProperty g)
  edgeProperty :: Edge g -> g a -> IO (EdgeProperty g)
  atomicModifyVertexProperty :: (VertexProperty g -> IO (VertexProperty g)) -> 
                                Vertex g -> g a -> IO (g a) -- fixed 

spanForest :: ParGraph g => [Vertex g] -> StateT (g a) IO ()
spanForest roots = parallelise (map spanTree roots) -- parallel version

spanForestSeq :: ParGraph g => [Vertex g] -> StateT (g a) IO ()
spanForestSeq roots = forM_ roots spanTree -- sequencial version

spanTree :: ParGraph g => Vertex g -> StateT (g a) IO ()
spanTree root = spanTreeOneStep root >>= \res -> case res of
  [] -> return ()
  adjs -> spanForestSeq adjs

spanTreeOneStep :: ParGraph g => Vertex g -> StateT (g a) IO [Vertex g]
spanTreeOneStep v = StateT $ \g -> adjacencies g v >>= \adjs -> return (adjs, g)

parallelise :: (ParGraph g, Monoid b) => [StateT (g a) IO b] -> StateT (g a) IO b
parallelise [] = return mempty
parallelise ss = syncGraphOp $ map forkGraphOp ss

forkGraphOp :: (ParGraph g, Monoid b) => StateT (g a) IO b -> StateT (g a) IO (MVar b)
forkGraphOp t = do 
  s <- get
  mv <- mapStateT (forkHelper s) t
  return mv
  where
    forkHelper s x = do
      mv <- newEmptyMVar
      forkIO $ x >>= \(b, s) -> putMVar mv b
      return (mv, s)

syncGraphOp :: (ParGraph g, Monoid b) => [StateT (g a) IO (MVar b)] -> StateT (g a) IO b
syncGraphOp [] = return mempty
syncGraphOp ss = collectMVars ss >>= waitResults
  where
    collectMVars [] = return []
    collectMVars (x:xs) = do
      mvx <- x
      mvxs <- collectMVars xs
      return (mvx:mvxs)
    waitResults mvs = StateT $ \g -> forM mvs takeMVar >>= \res -> return ((mconcat res), g)

Why are you not satisfied with TVars? stm is Haskell's really elegant solution to concurrency. I have never met a problem I could not solve using software-transactional memory. — Gabriella Gonzalez
I'm not sure "thread-safety" is the appropriate word in talking about the behavior of modifyMVar; your program isn't going to segfault or blow up like it would in python, you'll just get "blocked indefinitely" exceptions thrown in your threads. — jberryman
@GabrielGonzalez I guess STM's implementaion using "log then commit" is relatively uneffcient and may cause too much overhead. But I have not compare the effeciency quantitatively, though. — pysuxing
@pysuxing That is not how MVars behave. Calling takeMVar as modifyMVar does will leave the MVar empty until a putMVar is called. Your example will result in thread B blocking on the MVar until thread A completes its putMVar operation and will result in thread B reading the value put in the MVar by thread A. In your example, B cannot proceed until A finishes unless some other thread puts something into the MVar. — sabauma
Ok, now I think I understand your use. You only want each VertexProperty to be modifiable atomically. You don't care about being able to modify multiple vertices at a time, so atomicModifyIORef should be fine for your problem. However, if you ever want to combine multiple modifications into a single atomic modification you will want to break out the STM. — Gabriella Gonzalez

Daniel Wagner Daniel Wagner · Accepted Answer · 2013-04-07T01:37:55

Modern processors offer a compare-and-swap instruction that atomically modifies a single pointer. I expect if you track down deep enough, you will find that this instruction is the one used to implement atomicModifyIORef. It is therefore easy to provide atomic access to a single pointer. However, because there isn't such hardware support for more than one pointer, whatever you need will have to be done in software. This typically involves inventing and manually enforcing a protocol in all your threads -- which is complicated and error-prone.
Yes, if all threads agree to only use the "single takeMVar followed by a single putMVar" behavior, then modifyMVar is safe.

How to use thread safe shared variables in Haskell

1 Answers