6
votes

I'm thinking of using POSIX robust mutexes to protect a shared resource among different processes (on Linux). However, I have some doubts about their safety in different scenarios. I have the following questions:

  1. Are robust mutexes implemented in the kernel or in user code?

  2. If the latter, what would happen if a process crashes while in a call to pthread_mutex_lock or pthread_mutex_unlock, while the shared pthread_mutex data structure is being updated?

    I understand that if a process locks the mutex and dies, a thread in another process will be woken up and its lock call will return EOWNERDEAD. However, what would happen if the process dies (in the unlikely case) exactly when the pthread_mutex data structure (in shared memory) is being updated? Will the mutex get corrupted in that case? What would happen to another process that has the same shared memory mapped if it were to call a pthread_mutex function? Can the mutex still be recovered in this case?

  3. This question applies to any pthread object with PTHREAD_PROCESS_SHARED attribute. Is it safe to call functions like pthread_mutex_lock, pthread_mutex_unlock, pthread_cond_signal, etc. concurrently on the same object from different processes? Are they thread-safe across different processes?

2
I'm confused by this question. Why would a POSIX standardized API exist if it weren't useful for its intended purpose? A bit of common sense would quickly evaporate your existential doubts, IMO - sehe
I guess for my 3rd question, from the common-sense point of view, these functions should be thread-safe. For the 2nd question, though, it's not that simple: I don't see how shared memory corruption would be avoided if the process happens to crash at the wrong moment. - Yevgeniy P
The OS designers, however, do know how to avoid it. A process might appear to "crash at any moment". In reality, though, it's always a definable moment, an interrupt, and the OS can handle that like it does any other. Assuming standard reliable hardware. (The fact that the "process dies" doesn't mean catastrophe at the kernel level: it's just a subroutine with a lot of resources, in essence. Threads themselves don't even exist as such! They're just stackful abstractions.) - sehe
I see that this would be true for objects kept in the kernel, like file locks or System V semaphores, but pthread mutex data structures are kept in user space. Would that still be true for them too? - Yevgeniy P
You don't actually know where the data structures are going to be kept. Regardless of that, the OS governs the process memory even after the process is terminated, so there is no reason why the OS couldn't correctly handle the situation. I'll write an answer based on the relevant documentation - sehe

2 Answers

11
votes

From the man-page for pthreads:

 Over time, two threading implementations have been provided by the
   GNU C library on Linux:

   LinuxThreads
          This is the original Pthreads implementation.  Since glibc
          2.4, this implementation is no longer supported.

   NPTL (Native POSIX Threads Library)
          This is the modern Pthreads implementation.  By comparison
          with LinuxThreads, NPTL provides closer conformance to the
          requirements of the POSIX.1 specification and better
          performance when creating large numbers of threads.  NPTL is
          available since glibc 2.3.2, and requires features that are
          present in the Linux 2.6 kernel.

   Both of these are so-called 1:1 implementations, meaning that each
   thread maps to a kernel scheduling entity.  Both threading
   implementations employ the Linux clone(2) system call.  In NPTL,
   thread synchronization primitives (mutexes, thread joining, and so
   on) are implemented using the Linux futex(2) system call.

And from man futex(7):

   In its bare form, a futex is an aligned integer which is touched only
   by atomic assembler instructions.  Processes can share this integer
   using mmap(2), via shared memory segments or because they share
   memory space, in which case the application is commonly called
   multithreaded.

An additional remark found here:

(In case you’re wondering how they work in shared memory: Futexes are keyed upon their physical address)

Summarizing, Linux decided to implement pthreads on top of its "native" futex primitive, which indeed lives in the user process address space. For process-shared synchronization primitives, this is shared memory, so the other processes will still be able to see it after one process dies.
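To make the "aligned integer in shared memory" idea concrete, here is a minimal sketch of two processes coordinating through a futex word in a MAP_SHARED mapping. It is only an illustration, not how glibc implements its mutexes: the names futex_op and demo_shared_futex are made up for this example, and error handling is reduced to assert.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: glibc exports no futex() wrapper, so use syscall(2). */
static long futex_op(_Atomic uint32_t *word, int op, uint32_t val)
{
    return syscall(SYS_futex, (uint32_t *)word, op, val, NULL, NULL, 0);
}

/* The parent sleeps on a futex word until the child sets it to 1.
 * Returns the value the parent finally observed in the word. */
int demo_shared_futex(void)
{
    /* One aligned 32-bit integer, visible to both processes after fork(). */
    _Atomic uint32_t *word = mmap(NULL, sizeof *word, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    assert(word != MAP_FAILED);   /* sketch: abort instead of handling errors */
    atomic_store(word, 0);

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        /* Child: change the word in user space, then ask the kernel to
         * wake at most one waiter sleeping on it. */
        atomic_store(word, 1);
        futex_op(word, FUTEX_WAKE, 1);
        _exit(0);
    }

    /* Parent: FUTEX_WAIT only sleeps if the word still holds 0, so a wake
     * that happens before this call is not lost; the loop also covers
     * spurious returns such as EINTR. */
    while (atomic_load(word) == 0)
        futex_op(word, FUTEX_WAIT, 0);

    waitpid(pid, NULL, 0);
    int seen = (int)atomic_load(word);
    munmap((void *)word, sizeof *word);
    return seen;
}
```

Note that the non-private FUTEX_WAIT and FUTEX_WAKE operations are used here, since the private variants are only meant for threads of a single process.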

What happens in case of process termination? Ingo Molnar wrote an article called Robust Futexes about just that. The relevant quote:

Robust Futexes

There is one race possible though: since adding to and removing from the list is done after the futex is acquired by glibc, there is a few instructions window for the thread (or process) to die there, leaving the futex hung. To protect against this possibility, userspace (glibc) also maintains a simple per-thread 'list_op_pending' field, to allow the kernel to clean up if the thread dies after acquiring the lock, but just before it could have added itself to the list. Glibc sets this list_op_pending field before it tries to acquire the futex, and clears it after the list-add (or list-remove) has finished
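From the application's side, this kernel cleanup surfaces as EOWNERDEAD on the next lock attempt. The following is a minimal sketch, assuming Linux with glibc/NPTL, of a robust process-shared mutex whose owner dies while holding it. The struct shared_state layout and the function names are hypothetical, chosen only for this example; the pthread calls themselves are the standard POSIX ones.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared region: the mutex plus the data it protects. */
struct shared_state {
    pthread_mutex_t lock;
    int counter;
};

/* Map a shared region and initialise a robust, process-shared mutex in it. */
static struct shared_state *shared_state_create(void)
{
    struct shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(s, sizeof *s);
        return NULL;
    }
    int rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(&s->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(s, sizeof *s);
        return NULL;
    }
    s->counter = 0;
    return s;
}

/* Lock the mutex. If the previous owner died, the caller already holds the
 * lock when EOWNERDEAD is returned; repair the data, then mark the mutex
 * consistent. Sets *recovered to 1 when that path was taken. */
static int shared_state_lock(struct shared_state *s, int *recovered)
{
    int rc = pthread_mutex_lock(&s->lock);
    *recovered = 0;
    if (rc == EOWNERDEAD) {
        /* Application-specific repair of s->counter would go here. */
        *recovered = 1;
        rc = pthread_mutex_consistent(&s->lock);
    }
    return rc;
}

/* The child locks the mutex and exits without unlocking it.
 * Returns 1 if the parent saw EOWNERDEAD and recovered, 0 if the lock was
 * acquired normally, -1 on error. */
int demo_owner_death(void)
{
    struct shared_state *s = shared_state_create();
    assert(s != NULL);            /* sketch: abort instead of handling errors */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&s->lock);
        _exit(0);                 /* die while still owning the mutex */
    }
    waitpid(pid, NULL, 0);

    int recovered = 0;
    int rc = shared_state_lock(s, &recovered);
    if (rc == 0) {
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    pthread_mutex_destroy(&s->lock);
    munmap(s, sizeof *s);
    return rc == 0 ? recovered : -1;
}
```

If the new owner unlocks without calling pthread_mutex_consistent, the mutex becomes permanently unusable and later lock attempts return ENOTRECOVERABLE. Also note that the mutex only tells you the owner died; bringing the protected data back to a sane state remains the application's job.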


Summary

Where this leaves you for other platforms, is open-ended. Suffice it to say that the Linux implementation, at least, has taken great care to meet our common-sense expectation of robustness.

Seeing that other operating systems usually resort to kernel-based synchronization primitives in the first place, it makes sense to me to assume their implementations would be even more naturally robust.

0
votes

According to the documentation at http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_getrobust.html, on a fully POSIX-compliant OS a process-shared mutex with the robust attribute will behave in the way you'd expect.

The problem, obviously, is that not all OSes are fully POSIX compliant, not even those claiming to be. Process-shared mutexes, and robust ones in particular, are among the finer points that are often missing from an OS's implementation of POSIX.