FreeBSD has some great notes on this topic and makes a good case study. You can read more here:
http://www.freebsd.org/doc/en/books/arch-handbook/smp-design.html
On the topic of locking, they are careful to use finer-grained locking, so that preemption will most likely occur only while no lock is held.
In cases where preempting a lock-holding thread would cause correctness issues, they provide an API for indicating that the thread is temporarily in a non-preemptible region:
While locks can protect most data in the case of a preemption, not all
of the kernel is preemption safe. For example, if a thread holding a
spin mutex is preempted and the new thread attempts to grab the same spin
mutex, the new thread may spin forever as the interrupted thread may
never get a chance to execute. Also, some code such as the code to
assign an address space number for a process during exec on the Alpha
needs to not be preempted as it supports the actual context switch
code. Preemption is disabled for these code sections by using a
critical section.
And:
The responsibility of the critical section API is to prevent context
switches inside of a critical section. With a fully preemptive kernel,
every setrunqueue of a thread other than the current thread is a
preemption point. One implementation is for critical_enter to set a
per-thread flag that is cleared by its counterpart. If setrunqueue is
called with this flag set, it does not preempt regardless of the
priority of the new thread relative to the current thread. However,
since critical sections are used in spin mutexes to prevent context
switches and multiple spin mutexes can be acquired, the critical
section API must support nesting. For this reason the current
implementation uses a nesting count instead of a single per-thread
flag.
So in your example, if you have a preemptible thread that holds locks important to the scheduler, you might mark that thread as temporarily non-preemptible.
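Here's a minimal, user-space sketch (in C) of the nesting-count idea the quote describes. critical_enter()/critical_exit() and the td_critnest field are modeled on FreeBSD's names, but the rest (the maybe_preempt helper, the printf "scheduler") is purely illustrative and not the actual kernel code:

    #include <assert.h>
    #include <stdio.h>

    struct thread {
        int td_critnest;            /* nesting count; 0 => preemption allowed */
    };

    static struct thread cur = { 0 };   /* stand-in for the kernel's curthread */

    static void critical_enter(void)
    {
        cur.td_critnest++;          /* enter a (possibly nested) critical section */
    }

    static void critical_exit(void)
    {
        assert(cur.td_critnest > 0);
        cur.td_critnest--;          /* leave the innermost critical section */
        /* In the kernel, a preemption deferred while the count was non-zero
         * would be performed here once it drops back to zero. */
    }

    /* Called whenever another thread is made runnable (setrunqueue). */
    static void maybe_preempt(void)
    {
        if (cur.td_critnest > 0) {
            printf("in a critical section (depth %d): preemption deferred\n",
                   cur.td_critnest);
            return;
        }
        printf("no critical section held: would switch to the new thread\n");
    }

    int main(void)
    {
        critical_enter();           /* e.g. acquiring a spin mutex */
        critical_enter();           /* nested: acquiring a second spin mutex */
        maybe_preempt();            /* deferred, count == 2 */
        critical_exit();
        maybe_preempt();            /* still deferred, count == 1 */
        critical_exit();
        maybe_preempt();            /* now preemption would be allowed */
        return 0;
    }

The key point from the quote is that nesting is just a counter: acquiring a second spin mutex bumps the count rather than setting an already-set flag, and preemption only becomes possible again once every critical_exit has run.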
You can find parallels for this approach even in application-level software. .Net has a concept called Constrained Execution Regions [1] [2] [3]. While CERs have nothing to do with scheduling, they are used to signal to the VM that the block of code about to execute must run 'atomically': the VM should postpone any Thread.Abort() it might otherwise perform, and it ensures up front that the code can complete (methods are already JIT'd and there's enough stack space). Different purpose, but a similar rough idea - tell the scheduling overlords "if you interrupt me in a weird way, you may break correctness".
Of course, whether it's kernel preemption or .Net CERs, it is up to the developer to correctly identify every critical execution region so that the relevant lock invariants are actually enforced.
FreeBSD also has facilities to aid in debugging this sort of problem and to help identify, e.g., deadlocks. One particular technique is lock ordering: each lock is assigned a priority, and as you grab locks you record the highest priority you currently hold. If you ever attempt to grab a lock whose priority is lower than that, you know you've violated the lock ordering and should report it, e.g. by recording an operating system fault.
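As a rough illustration (a sketch of the idea only - FreeBSD's actual checker for this is the WITNESS facility, which is far more elaborate and also tracks lock releases), you could keep a per-thread "highest lock level held" and check it on every acquisition. The names ordered_lock, ordered_lock_acquire, and highest_level_held are invented for this example, and release tracking is omitted for brevity:

    #include <assert.h>
    #include <stdio.h>

    struct ordered_lock {
        int level;                       /* position in the global lock order */
        const char *name;
        /* ... the underlying mutex would live here ... */
    };

    static _Thread_local int highest_level_held;   /* 0 = no locks held */

    static void ordered_lock_acquire(struct ordered_lock *l)
    {
        if (l->level <= highest_level_held) {
            /* In a kernel this would be reported as a fault/panic. */
            fprintf(stderr, "lock order violation: %s (level %d) after level %d\n",
                    l->name, l->level, highest_level_held);
            assert(0);
        }
        highest_level_held = l->level;
        /* ... actually acquire the underlying mutex ... */
    }

    int main(void)
    {
        struct ordered_lock lock1 = { 1, "lock1" };
        struct ordered_lock lock2 = { 2, "lock2" };

        ordered_lock_acquire(&lock1);    /* fine: 1 > 0 */
        ordered_lock_acquire(&lock2);    /* fine: 2 > 1 */
        /* ordered_lock_acquire(&lock1);    would trip the check: 1 <= 2 */
        return 0;
    }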
The need for lock ordering may not be immediately apparent, but consider the classic deadlock example - two resources protected by two locks:
Thread A wants to grab locks 1 and 2.
Thread B wants to grab locks 1 and 2, but:
- Thread A grabs lock 1
- Thread B grabs lock 2
- Thread A tries to grab lock 2, but can't
- Thread B tries to grab lock 1, but can't.
- Deadlock is achieved.
The threads did not grab the locks in a consistent order. Thread B should notice that it is holding lock 2 while trying to grab lock 1; since 1 < 2, it is grabbing locks out of order and should abort. Voilà, the deadlock is avoided, or at least detected and reported so it can be fixed.
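In code, the fix is simply to pick one global order and have every thread follow it. A small pthreads sketch (the lock and thread names are made up for the example):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;  /* order: 1 */
    static pthread_mutex_t lock2 = PTHREAD_MUTEX_INITIALIZER;  /* order: 2 */

    static void *worker(void *arg)
    {
        /* Every thread acquires lock1 first, then lock2: whichever thread
         * wins lock1 can always go on to take lock2, so neither thread can
         * end up waiting on a lock the other already holds. */
        pthread_mutex_lock(&lock1);
        pthread_mutex_lock(&lock2);
        printf("thread %s holds both locks\n", (const char *)arg);
        pthread_mutex_unlock(&lock2);
        pthread_mutex_unlock(&lock1);
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, "A");
        pthread_create(&b, NULL, worker, "B");
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }

Because both threads take lock1 before lock2, the problematic interleaving above simply cannot occur.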