
The question duplicates the topic of the question asked here.

I would like to ask for some additional clarification with respect to a different point of view.

In distributed computing, memory coherency is ultimately implemented using message passing over network channels, with distributed locking and so forth. Message passing, IIUC, would not always eliminate concurrency, except at the very lowest level, because the processes still usually affect each other's state. And they do so in what they believe to be a consistent way.

For example, a simple command interpreter can be implemented on top of message passing, and the commands can be sent as parts of several remote transactions executed in parallel by multiple conversant processes. So, high-level interactions would require design for concurrency in most cases. That is, IMO, it is very unlikely that processes have no transaction semantics for long-term operations.

Additionally, sending a message with a value in a consistent state does not guarantee correctness. What matters is how this value is produced, and what happens between the messages that provided the input data and the messages that publish the transformed result.

On the other hand, low-level interactions with physical memory are always essentially some kind of message passing over buses. So, at the lowest level, sharing memory and message passing are identical.

And at the per-instruction level, the atomicity of aligned loads and stores is usually guaranteed. So, the distinction is still blurry to me.
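
To make the last two points concrete, here is a minimal sketch, assuming a C11 compiler with <threads.h> and <stdatomic.h> (the loop bounds are arbitrary): each aligned load and store of the counter is individually atomic, yet the read-modify-write in between is not, so the published value can be stale.

```c
/* Minimal sketch, assuming a C11 compiler with <threads.h> and
 * <stdatomic.h>. Each aligned load and store of `counter` is atomic on
 * its own, but the load-modify-store sequence is not, so increments
 * from the two threads can be lost. */
#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

atomic_int counter = 0;

int racy_increment(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        int v = atomic_load(&counter);  /* atomic read              */
        atomic_store(&counter, v + 1);  /* atomic write, but the    */
    }                                   /* read-modify-write is not */
    return 0;
}

int main(void) {
    thrd_t t1, t2;
    thrd_create(&t1, racy_increment, NULL);
    thrd_create(&t2, racy_increment, NULL);
    thrd_join(t1, NULL);
    thrd_join(t2, NULL);
    /* Usually prints less than 200000; atomic_fetch_add would fix it. */
    printf("counter = %d\n", atomic_load(&counter));
    return 0;
}
```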

In prose: how does the choice of shared memory vs. message passing relate to concurrency? Is it just a matter of choosing a technical pattern for solving concurrency and a mathematical model for defining and analyzing the interactions of parallel processes, or are those techniques also architectural patterns that, when applied systematically, fundamentally affect the concurrency issues in the system?

Thanks.

Edit (some additional remarks):

Apparently, the two methods are distinguished by correctness and performance. But I have the following problems with this distinction.

I know that messages can act like transfers of big, scattered virtual datums. However, the "by-value" approach does not guarantee consistency without synchronization beyond the atomic read of a non-unitary (or procedurally generated) logical datum. By consistency, I mean something like causality or sequential ordering of the changes, etc. With message passing, indeed, every process only mutates its own memory. A process acts just like a controller for its private memory. This is like sharing on top of message passing, serialized by the process owning the datum, but on a MESSAGE-BY-MESSAGE basis (similar to how memory serializes on a word-by-word or cache-line-by-cache-line basis).

It remains the responsibility of the application programmer to guarantee synchronization of the transactions involved in sending the messages. Namely, messages to one process, from multiple conversant processes, must be sent in a consistent order corresponding to the semantics of the operations those processes are executing. Maybe with control messages to the owning process, or through coordination directly between the contenders, but some restriction on the concurrency of the messages is most likely necessary.
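
To illustrate what I mean, a hypothetical POSIX C sketch (the GET/SET protocol, message layout, and forced alternation are made up for the example): the owner applies messages strictly one at a time, yet each client's two-message read-then-write "transaction" still interleaves with the other's, and one increment is lost.

```c
/* Hypothetical sketch (POSIX, error handling omitted): an "owner"
 * process serializes GET/SET messages one at a time, yet each client's
 * two-message transaction (GET, then SET of value+1) still interleaves
 * with the other client's, and one increment is lost. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/wait.h>

enum { GET, SET };
struct msg { int op; int value; };

/* Send one request and block for the owner's reply. */
static int rpc(int fd, int op, int value) {
    struct msg m = { op, value };
    write(fd, &m, sizeof m);
    read(fd, &m, sizeof m);
    return m.value;
}

int main(void) {
    int chan[2][2];                    /* one socket pair per client */
    socketpair(AF_UNIX, SOCK_STREAM, 0, chan[0]);
    socketpair(AF_UNIX, SOCK_STREAM, 0, chan[1]);

    if (fork() == 0) {                 /* owner: sole writer of `counter` */
        int counter = 0;
        for (int i = 0; i < 4; i++) {  /* expects 2 GETs and 2 SETs */
            struct msg m;
            int fd = chan[i % 2][0];   /* alternate clients to force
                                          the worst-case interleaving */
            read(fd, &m, sizeof m);
            if (m.op == SET) counter = m.value;
            m.value = counter;
            write(fd, &m, sizeof m);
        }
        printf("final counter = %d (2 increments were sent)\n", counter);
        exit(0);
    }
    for (int c = 0; c < 2; c++) {      /* two clients, each "incrementing" */
        if (fork() == 0) {
            int v = rpc(chan[c][1], GET, 0);
            rpc(chan[c][1], SET, v + 1);   /* writes back a stale v + 1 */
            exit(0);
        }
    }
    for (int i = 0; i < 3; i++) wait(NULL);
    return 0;                          /* owner prints "final counter = 1" */
}
```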

Sharing memory can indeed be faster for local inter-process communication (ignoring contention), but why would this be the case for cross-machine communication? Shared memory for distributed computing is implemented on top of the network communication. So, shared memory, aside from caching benefits, cannot be faster.

The techniques are obviously different. What I can't seem to understand is how they can be broadly compared to each other, when there is nothing intrinsically beneficial to either one. One must assume what the platform supplies and what the software tries to accomplish, and such assumptions cannot be universally true.

You have to define what exactly you mean by message passing and also by shared memory concurrency. The terms are not well-defined and mean different things to different people. In addition, depending on what view you take, SM is MP and MP is SM. – I GIVE CRAP ANSWERS
@I GIVE CRAP ANSWERS: I understand. But if you look at the other question, you will see that it also had no clarifications. It has an accepted answer that seems to be pretty straightforward in its definition and agrees with the distinction given in the Wikipedia article. The problem is that I don't understand what the major implications of this distinction are. And it is further used to differentiate parallel and distributed computing (which could otherwise be defined with phrases such as "in one box" vs. "among multiple boxes"). – simeonz
@I GIVE CRAP ANSWERS: In a sense, your comment is sort of an answer to my question, but the terms are used very frequently in similar contexts, and I assumed that they are less overloaded and free from interpretation. – simeonz
Well, I am an Erlang programmer in my day job. We usually use message-passing semantics, since they have certain advantages in a distributed environment on stock hardware. But do note that Erlang also supports a shared-memory tuple space called ETS. I am afraid I have a hard time being more specific, since to a C++ programmer the terms may apply differently. – I GIVE CRAP ANSWERS

1 Answer


If you are architecting a distributed and/or multi-threaded application, you will want to ensure that it performs better than a single-process, single-threaded application.

With distributed applications, i.e. multiple processes on potentially multiple systems, latency between communicating nodes is a prime concern. With the advent of microservers, latency as well as power consumption go down significantly, to the point where it behooves software developers to start thinking about how to design, develop, debug, deploy, etc. multi-core/microserver applications.

When developing multi-process applications, it usually boils down to using two sets of OS calls at the lowest layer to implement inter-process communication: shared memory, e.g. by using shmget, shmat, shmctl, etc., and message passing, e.g. by using socket, accept, send, recv, etc.

With shared memory, latency is negligible. Once a reference to a shared memory buffer is obtained, an application can go to any part of the shared memory and modify it. Of course, processes have to cooperate using locks, mutexes, etc. to ensure that the integrity of the data structures is maintained and that the application works correctly. The problem with this solution is: how do you test, for all situations, that integrity is maintained when there is no control over when a context switch may occur?
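
For illustration, a minimal sketch of this route using the SysV calls named above (error handling omitted; the counter and the process-shared semaphore are my own example, not part of the calls themselves). Compile with -pthread on Linux:

```c
/* A minimal sketch: two processes increment a counter in a SysV shared
 * memory segment, cooperating via a process-shared POSIX semaphore. */
#include <stdio.h>
#include <semaphore.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared { sem_t lock; int counter; };

int main(void) {
    /* Create an anonymous segment and map it; the child inherits it. */
    int id = shmget(IPC_PRIVATE, sizeof(struct shared), IPC_CREAT | 0600);
    struct shared *s = shmat(id, NULL, 0);
    sem_init(&s->lock, /*pshared=*/1, 1);   /* 1 = usable across processes */
    s->counter = 0;

    if (fork() == 0) {                      /* child */
        for (int i = 0; i < 100000; i++) {
            sem_wait(&s->lock);             /* acquire */
            s->counter++;                   /* critical section */
            sem_post(&s->lock);             /* release */
        }
        _exit(0);
    }
    for (int i = 0; i < 100000; i++) {      /* parent does the same */
        sem_wait(&s->lock);
        s->counter++;
        sem_post(&s->lock);
    }
    wait(NULL);
    printf("counter = %d\n", s->counter);   /* 200000, thanks to the lock */
    shmdt(s);
    shmctl(id, IPC_RMID, NULL);             /* tear the segment down */
    return 0;
}
```

Without the semaphore, the two unsynchronized increments would exhibit exactly the lost-update problem described above, and no amount of testing could prove its absence.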

With message passing, no data is shared. All communication is by means of exchanging buffers. This eliminates having to be concerned with locks, mutexes, etc., but now one has to ensure that the application can handle issues such as network timeouts, bandwidth, latency, etc.
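
Again for illustration, a minimal sketch using the socket calls named above (loopback only; the port number is arbitrary and error handling is omitted):

```c
/* A minimal sketch: a forked "client" and the parent "server" exchange
 * buffers over loopback TCP; no memory is shared and no locks are used. */
#include <arpa/inet.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &(int){1}, sizeof(int));

    struct sockaddr_in addr = { .sin_family = AF_INET };
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(5555);               /* arbitrary test port */
    bind(srv, (struct sockaddr *)&addr, sizeof addr);
    listen(srv, 1);

    if (fork() == 0) {                         /* client process */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        connect(fd, (struct sockaddr *)&addr, sizeof addr);
        send(fd, "ping", 5, 0);                /* request buffer  */
        char reply[16];
        recv(fd, reply, sizeof reply, 0);      /* response buffer */
        printf("client got: %s\n", reply);
        close(fd);
        _exit(0);
    }

    int conn = accept(srv, NULL, NULL);        /* server side */
    char buf[16];
    recv(conn, buf, sizeof buf, 0);            /* no locks: the socket
                                                  serializes the exchange */
    send(conn, "pong", 5, 0);
    close(conn);
    close(srv);
    wait(NULL);
    return 0;
}
```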

In order to develop apps that can scale beyond a single system, the most common method is to use message passing. If the communicating processes happen to be on the same host, it still works.

Irrespective of whether it is shared memory or message passing, concurrency in the end is essentially about ensuring the integrity of data structures: with locks/mutexes in the case of shared memory, and by serializing requests/responses in the case of message passing.