
I am trying to figure out the relationship between shared memory based concurrency algorithms (Peterson's / Bakery) and the use of semaphores and mutexes.

In the first case, we have a system without OS intervention, and processes can synchronize themselves using shared memory and busy waiting.

In the second case, the OS provides processes/threads with the ability to block, and not have to busy wait.

Is there ever a situation where we'd like to use shared memory in addition to semaphores (to ensure fairness / lack of starvation), or does the OS offer a better way to do this?

(I am wondering about the general concepts, but answers specific to POSIX/Win32/Java threads are also interesting.)

Thanks a lot!


1 Answer


I can't think of any circumstances where what you actually want is a busy wait. Busy waiting just consumes processor time without achieving anything. That's not to say that "busy wait" algorithms aren't useful (they are), but the "busy wait" part is not the desired property, it is just a necessary consequence of a property that is desired.

Peterson's lock algorithm and Lamport's bakery algorithm are fundamentally just implementations of the mutex concept. OS facilities provide implementations of the same concept, but with different trade-offs.
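To make that concrete, here is a minimal sketch of Peterson's lock in C11. This is an assumption of mine about how you'd express it today, not part of the original question: the textbook version uses plain shared variables, but on modern hardware you need atomics (sequentially consistent here) to prevent reordering.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Peterson's lock for exactly two threads, id 0 and id 1.
   Seq-cst atomics stand in for the "plain" shared variables of
   the textbook version, which hardware would otherwise reorder. */
static atomic_bool flag[2];   /* flag[i]: thread i wants the lock */
static atomic_int  victim;    /* the thread that yields on a tie  */

void peterson_lock(int id) {
    int other = 1 - id;
    atomic_store(&flag[id], true);   /* announce intent            */
    atomic_store(&victim, id);       /* let the other go first     */
    /* Busy wait while the other thread wants in and we are the
       designated victim. */
    while (atomic_load(&flag[other]) && atomic_load(&victim) == id)
        ;
}

void peterson_unlock(int id) {
    atomic_store(&flag[id], false);
}
```

Note that this is still a mutex: mutual exclusion plus a fairness rule, implemented entirely in shared memory with busy waiting.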

The "ideal" implementation of a mutex would have zero overhead: acquiring an uncontended lock would take no time at all, a waiting thread would wake the instant the prior owner released the lock, and in the meantime the waiting thread would consume no processor time.

A "busy wait" or "spin lock" algorithm trades processor time used by the waiting thread for a reduced wake-up time. Provided the thread is currently scheduled on a processor, a busy-waiter will wake as fast as the processor can transfer the necessary data for acquiring the lock and synchronizing the threads, but whilst it is waiting it will consume its maximum allotment of processor time. If the number of threads exceeds the number of available processors, this may well take time from the thread that currently owns the mutex, thus making the wait longer. However, in some cases the low latency between unlocking and locking is worth the trade-off.

On the other hand, a "blocking" mutex that uses OS facilities to put a waiting thread to sleep has a different trade-off. In this case, the time between unlocking a mutex and a waiting thread acquiring it can be quite large, possibly several hundred times larger than with a busy-wait algorithm. The benefit is that the waiting thread really does consume no processor time whilst waiting, so the OS can schedule other work whilst the thread is waiting. This can thus potentially reduce the overall wait time, and increase the overall throughput of the system.
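In POSIX terms, this is what a plain `pthread_mutex_t` gives you; a small sketch (the worker/counter names are just my example):

```c
#include <pthread.h>

/* Blocking critical section: a thread that fails to acquire the
   mutex is put to sleep by the OS instead of spinning. */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; ++i) {
        pthread_mutex_lock(&m);    /* sleeps if already held        */
        ++counter;
        pthread_mutex_unlock(&m);  /* wakes a sleeping waiter, if any */
    }
    return NULL;
}

long run_two_workers(void) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

While either worker sleeps inside `pthread_mutex_lock`, the OS is free to schedule the other thread (or anything else) on that processor.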

Some mutex implementations use a combination of busy-waiting and blocking: they busy-wait for a short time, and then switch to blocking if the lock cannot be acquired within that time. This gives the benefit of a fast wake-up if the lock is released shortly after the thread begins waiting, whilst consuming no processor time if the thread has to wait a long time. It also has the downsides of high processor usage for short waits and slow wake-ups for long waits.
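A crude spin-then-block sketch, built from POSIX primitives (the `SPIN_LIMIT` value is arbitrary; real adaptive mutexes tune or adapt it at runtime):

```c
#include <pthread.h>

/* Spin-then-block: poll the mutex a bounded number of times, then
   fall back to a blocking acquire. */
#define SPIN_LIMIT 1000

void hybrid_lock(pthread_mutex_t *m) {
    for (int i = 0; i < SPIN_LIMIT; ++i) {
        if (pthread_mutex_trylock(m) == 0)
            return;            /* acquired during the spin phase   */
    }
    pthread_mutex_lock(m);     /* give up spinning; sleep in the OS */
}
```

This is roughly what "adaptive" mutexes on Linux and critical sections with a spin count on Windows do, though the production versions spin on the lock word in user space rather than hammering `trylock`.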