1 vote

I do not quite understand the benefit of "multiple independent virtual addresses which point to the same physical address", even though I have read many books and posts.

E.g., in a similar question, Difference between physical addressing and virtual addressing concept,

the post claims that programs will not crash each other, and that

"in general, a particular physical page only maps to one application's virtual space"

Well, in http://tldp.org/LDP/tlk/mm/memory.html, in section "shared virtual memory", it says

"For example there could be several processes in the system running the bash command shell. Rather than have several copies of bash, one in each processes virtual address space, it is better to have only one copy in physical memory and all of the processes running bash share it."

If one physical address (e.g., the shell program) is mapped to two independent virtual addresses, how can this not crash? Wouldn't it be the same as using physical addressing?

What does virtual addressing provide that is not possible or convenient with physical addressing? If there were no virtual memory, i.e., two processes pointed directly at the same physical memory, I think that with some coordinating mechanism it could still work. So why bother with virtual addressing, the MMU, virtual memory, and all that?


2 Answers

2 votes

If one physical address (e.g., shell program) mapped to two independent virtual addresses

Multiple processes can be built to share a piece of memory; e.g. with one acting as a server that writes to the memory, the other as a client reading from it, or with both reading and writing. This is a very fast way of doing inter-process communication (IPC). (Other solutions, such as pipes and sockets, require copying data to the kernel and then to the other process, which shared memory skips.) But, as with any IPC solution, the programs must coordinate their reads and writes to the shared memory region by some messaging protocol.
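As a rough sketch of that server/client idea using POSIX shared memory (the object name "/demo_shm" and the one-string "protocol" are made up for illustration), the writer side could look like this; a reader would shm_open the same name, mmap it, and see the string with no copy through the kernel:

    /* shm_writer.c -- create a POSIX shared memory object and write into it.
       Build with: cc shm_writer.c -o shm_writer   (add -lrt on older systems) */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        const char *name = "/demo_shm";                 /* illustrative name */
        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

        /* Every process that maps this object gets virtual addresses backed
           by the same physical pages. */
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        strcpy(p, "hello from the writer");   /* instantly visible to readers */
        printf("wrote message into %s; press Enter to exit\n", name);
        getchar();                            /* keep the process around for the demo */
        shm_unlink(name);                     /* remove the object's name */
        return 0;
    }

Real code would of course add the coordination mentioned above, e.g. a semaphore or a flag stored in the shared page, so the reader knows when the data is ready.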

Also, the "several processes in the system running the bash command shell" from the example will be sharing the read-only part of their address spaces, which includes the code. They can execute the same in-memory code concurrently, and won't kill each other since they can't modify it.
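You can see that protection in action with a deliberately crashing sketch: trying to store into the page that holds a function's machine code is expected to be killed with SIGSEGV, precisely because text pages are mapped without write permission.

    #include <stdio.h>
    #include <string.h>

    /* The machine code for this function lives in a read-only (text) page. */
    static int forty_two(void) { return 42; }

    int main(void) {
        printf("forty_two() = %d, code at %p\n", forty_two(), (void *)forty_two);
        /* Expected to die with SIGSEGV: the text page has no write permission,
           so we cannot corrupt code that other processes may be sharing. */
        memset((void *)forty_two, 0, 1);
        puts("never reached");
        return 0;
    }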

In the quote

in general, a particular physical page only maps to one application's virtual space

the "in general" part should really be "typically": memory pages are not shared unless you set them up to be, or unless they are read-only.

4 votes

There are two main uses of this feature.

First, you can share memory between processes, which can then communicate via the shared pages. In fact, shared memory is one of the simplest forms of IPC.
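A minimal sketch of that (assuming a POSIX system): an anonymous MAP_SHARED mapping created before fork() is one region that both processes map, so the child's write is immediately visible through the parent's virtual addresses.

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* One page whose physical frame is shared by parent and child. */
        char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (shared == MAP_FAILED) { perror("mmap"); return 1; }

        if (fork() == 0) {                    /* child */
            strcpy(shared, "written by the child");
            _exit(0);
        }
        wait(NULL);                           /* crude synchronisation for the demo */
        printf("parent reads: %s\n", shared); /* sees the child's write */
        return 0;
    }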

But shared read-only pages can also be used to avoid useless duplication: most of the time, the code of a program does not change after it has been loaded into memory, so its memory pages can be shared among all the processes that are running that program. Obviously only the code is shared; the memory pages containing the stack, the heap and in general the data (or, if you prefer, the state) of the program are not shared.
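On Linux you can watch this happen: the sketch below just dumps its own memory map, and the r-xp segments of the executable and of libc are the code pages, which the kernel backs with the same physical pages in every process running the same binary or library.

    #include <stdio.h>

    int main(void) {
        /* /proc/self/maps lists every mapping of this process. The segments
           marked "r-xp" are executable code; they are never written, so the
           kernel can back them with the same physical (page cache) pages in
           every process that runs this program or uses the same libc. */
        FILE *f = fopen("/proc/self/maps", "r");
        if (!f) { perror("fopen"); return 1; }
        int c;
        while ((c = fgetc(f)) != EOF)
            putchar(c);
        fclose(f);
        return 0;
    }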

This trick is improved with "copy on write". The code of executables usually doesn't change while running, but some programs are actually self-modifying (they were quite common in the past, when most development was still done in assembly); to support this, the operating system does read-only sharing as explained before, but if it detects a write to one of the shared pages, it disables the sharing for that page, creating an independent copy of it and letting the program write there.

This trick is particularly useful in situations where the data probably won't change, but might.
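The same copy-on-write machinery is visible from user space with a MAP_PRIVATE file mapping (here /etc/passwd, chosen only because it is world-readable): reads come from the shared page-cache page, but the first write silently gives this process its own private copy, and the file on disk is untouched.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/etc/passwd", O_RDONLY);   /* any readable file works */
        if (fd < 0) { perror("open"); return 1; }

        /* MAP_PRIVATE allows PROT_WRITE even on a read-only fd, because
           writes never reach the file: they trigger copy-on-write instead. */
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        printf("before: %.20s\n", p);
        p[0] = '#';           /* first write: the kernel copies the page for us */
        printf("after:  %.20s\n", p);
        /* /etc/passwd itself is unchanged; only our private copy differs. */
        close(fd);
        return 0;
    }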

Another case in which this technique is used is when a process forks: instead of copying every memory page (which would be completely useless if the child process immediately does an exec), the new process shares all its memory pages with the parent in copy-on-write mode, allowing quick process creation while still "faking" the classic fork behavior.
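A sketch of how that looks from user space: right after fork() both processes share the same physical pages, and only the writer gets a private copy, which is why the child's change below never shows up in the parent.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int counter = 100;   /* data page shared copy-on-write after fork() */

    int main(void) {
        if (fork() == 0) {                    /* child */
            /* This write faults; the kernel copies the page, and only the
               child's copy is modified. */
            counter = 999;
            printf("child  sees counter = %d\n", counter);
            _exit(0);
        }
        wait(NULL);
        printf("parent sees counter = %d\n", counter);   /* still 100 */
        return 0;
    }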