What are the benefits of using a specifically designed spinlock (e.g. http://anki3d.org/spinlock) vs. code like this:
std::mutex m;
while (!m.try_lock()) {}
# do work
m.unlock();
What are the benefits of using a specifically designed spinlock (e.g. http://anki3d.org/spinlock) vs. code like this:
std::mutex m;
while (!m.try_lock()) {}
# do work
m.unlock();
On typical hardware, there are massive benefits:
Your naive "fake spinlock" may saturate internal CPU buses while the CPU spins, starving other physical cores including the physical core that holds the lock.
If the CPU supports hyper-threading or something similar, your naive "fake spinlock" may consume excessive execution resources on the physical core, starving another thread sharing that physical core.
Your naive "fake spinlock" probably does extraneous write operations that result in bad cache behavior. When you perform a read-modify-write operation on an x86/x86_64 CPU (like the compare/exchange that try_lock probably does), it always writes even if the value isn't changed. This write causes the cache line to be invalidated on other cores, requiring them to re-share it when another core accesses that line. This is awful if threads on other cores contend for the same lock at the same time.
Your naive "fake spinlock" interacts badly with branch prediction. When you finally do get the lock, you take the mother of all mispredicted branches right at the point where you are locking out other threads and need to execute as quickly as possible. This is like a runner being all pumped up and ready to run at the starting line but then when he hears the starting pistol, he stops to catch his breath.
Basically, that code does everything wrong that it is possible for a spinlock to do wrong. Absolutely nothing is done efficiently. Writing good synchronization primitives requires deep hardware expertise.
The main benefit of using a spinlock is that it is extremely cheap to acquire and release if the all-important precondition is true: There is little or no congestion on the lock.
If you know with sufficient certitude that there will be no contention, a spinlock will greatly outperform a naive implementation of a mutex which will go through library code doing validations that you don't necessarily need, and do a syscall. This means doing a context switch (consuming several hundreds of cycles), and abandoning the thread's time slice and causing your thread to be rescheduled. This may take an indefinite time -- even if the lock would be available almost immediately afterwards, you can still have to wait several dozen milliseconds before your thread runs again in unfavorable conditions.
If, however, the precondition of no contention does not hold, a spinlock will usually be vastly inferior as it makes no progress, but it still consumes CPU resources as if it was performing work. When blocking on a mutex, your thread does not consume CPU resources, so these can be used for a different thread to do work, or the CPU may throttle down, saving power. That's not possible with a spinlock, which is doing "active work" until it succeeds (or fails).
In the worst case, if the number of waiters is greater than the number of CPU cores, spinlocks may cause huge, dysproportionate performance impacts because the threads that are active and running are waiting on a condition that can never happen while they are running (since releasing the lock requires a different thread to run!).
On the other hand, one should expect every modern no-suck implementation of std::mutex
to already include a tiny spinlock before falling back to doing a syscall. But... while it is a reasonable assumption, this is not guaranteed.
Another non-technical reason for using spinlocks in favor of a std::mutex
may be license terms. License terms are a poor rationale for a design decision, but they may nevertheless be very real.
For example, the present GCC implementation is based exclusively on pthreads, which implies that "anything MinGW" using anything from the standard threads library necessarily links with winpthreads (lacking alternatives). That means you are subject to the winpthreads license, which implies you must reproduce their copyright message. For some people, that's a dealbreaker.
m.lock();
? What's the point of the busy wait? – Igor Tandetniktry_lock()
is about as expensive aslock()
, and you are getting the worst of both worlds - the cost of a trip to the kernel, and using your CPU as a space heater. – Igor Tandetnik