4
votes

I am wondering if the returned enumerator is thread safe:

public IEnumerator<T> GetEnumerator()
{
    lock (_sync) {
        return _list.GetEnumerator();
    }
}

If I have multiple threads whom are adding data (also within lock() blocks) to this list and one thread enumerating the contents of this list. When the enumerating thread is done it clears the list. Will it then be safe to use the enumerator gotten from this method.

I.e. does the enumerator point to a copy of the list at the instance it was asked for or does it always point back to the list itself, which may or may not be manipulated by another thread during its enumeration?

If the enumerator is not thread safe, then the only other course of action I can see is to create a copy of the list and return that. This is however not ideal as it will generate lots of garbage (this method is called about 60 times per second).

3
For List of T, No, it isn't. You might want to consider immutable collections ( msdn.microsoft.com/en-us/library/dn385366(v=vs.110).aspx ), or at the very least, collections that are explicitly threadsafe, such as ConcurrentQueue - spender
Yeah ConcurrentQueue seems to fit the bill, thanks! - JensB
If anyone is interested in how MS has done the ConcurrentQueue I found their code implementation here: dotnetframework.org/default.aspx/4@0/4@0/untmp/DEVDIV_TFS/Dev10/… (very slow webpage) - JensB
Could anayone point to documentation of how ConcurrentQueue does its thread safety? I'm assuming it will be better implemented then what I can do but would be nice to see what the engineers at microsoft have done. - JensB

3 Answers

6
votes

No, not at all. This lock synchronizes only access to _list.GetEnumerator method; where as enumerating a list is lot more than that. It includes reading the IEnumerator.Current property, calling IEnumerator.MoveNext etc.

You either need a lock over the foreach(I assume you enumerate via foreach), or you need to make a copy of list.

Better option is to take a look at Threadsafe collections provided out of the box.

2
votes

According to the documentation, to guarantee thread-safity you have to lock collecton during entire iteration over it.

The enumerator does not have exclusive access to the collection; therefore, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

Another option, may be define you own custom iterator, and for every thread create a new instance of it. So every thread will have it's own copy of Current read-only pointer to the same collection.

1
votes

If I have multiple threads whom are adding data (also within lock() blocks) to this list and one thread enumerating the contents of this list. When the enumerating thread is done it clears the list. Will it then be safe to use the enumerator gotten from this method.

No. See reference here: http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.aspx

An enumerator remains valid as long as the collection remains unchanged. If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and the next call to MoveNext or Reset throws an InvalidOperationException. If the collection is modified between MoveNext and Current, Current returns the element that it is set to, even if the enumerator is already invalidated. The enumerator does not have exclusive access to the collection; therefore, enumerating through a collection is intrinsically not a thread-safe procedure. Even when a collection is synchronized, other threads can still modify the collection, which causes the enumerator to throw an exception. To guarantee thread safety during enumeration, you can either lock the collection during the entire enumeration or catch the exceptions resulting from changes made by other threads.

..

does the enumerator point to a copy of the list at the instance it was asked for or does it always point back to the list itself, which may or may not be manipulated by another thread during its enumeration?

Depends on the collection. See the Concurrent Collections. Concurrent Stack, ConcurrentQueue, ConcurrentBag all take snapshots of the collection when GetEnumerator() is called and returns elements from the snapshot. The underlying collection may change without changing the snapshot. On the other hand, ConcurrentDictionary doesn't take a snapshot, so changing the collection while iterating will immediately effect things per the rules above.

A trick I sometimes use in this case is to create a temporary collection to iterate so the original is free while I use the snapshot:

foreach(var item in items.ToList()) {
    //
}

If your list is too large and this causes GC churn, then locking is probably your best bet. If locking is too heavy, maybe consider a partial iteration each timeslice, if that is feasible.

You said:

When the enumerating thread is done it clears the list.

Nothing says you have to process the whole list at a time. You could instead remove a range of items, move them to a separate enumerating thread, let that process, and repeat. Perhaps iteratation and lists aren't the best model here. Consider the ConcurrentQueue with which you could build producers and consumer model, and consumers just steadily remove items to process without iteration