4
votes

Which takes longer?

Switching between the user & kernel modes (or) switching between two processes?

Please explain the reason too.

EDIT: I do know that whenever there is a context switch, it takes some time for the dispatcher to save the state of the previous process in its PCB, and then reload the next process from its corresponding PCB. And for switching between user and kernel mode, I know that the mode bit has to be changed. Is that all, or is there more to it?

1
This sounds like a homework question to me... - John Brodie
Not really, I'm studying for my master's entrance exam :) - user1956389
Newer processors have optimized sysenter/sysexit instructions for faster user/kernel switches. A context switch between processes also implies switching page tables and flushing caches and TLBs. - Nikolai Fetissov
Well, I would think that, since the kernel is what facilitates switching from one process to the next, the answer might be rather obvious. After all, you need to switch from user land to kernel mode for the scheduler (which is part of the kernel) to take control; then, after it makes its decisions, you have a switch from kernel to user land in the new process... So switching process to process likely takes at least twice as long as user to kernel, at least in the general case... - twalberg
Hmm, that looks like a wrong assumption. It's not as if user code initiates a switch from process to process. Also, the switch itself might be to a process blocked inside a system call, i.e. one already running in kernel mode. - Nikolai Fetissov

1 Answer

7
votes

Switching between processes takes longer (given that you actually switch, rather than run them in parallel), by an order of oh-my-god.

Trapping from userspace to kernelspace used to be done with a processor interrupt. Around 2005 (I don't remember the kernel version), after a discussion on the mailing list where someone found that trapping was slower (in absolute terms!) on a high-end Xeon processor than on an earlier Pentium II or III (again, from memory), they implemented it with a newer CPU instruction, sysenter (which had actually existed since the Pentium Pro, I think). This is done in the Virtual Dynamic Shared Object (vDSO) page in each process (cat /proc/pid/maps to find it), IIRC.
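To make the /proc/pid/maps hint concrete, here is a minimal sketch that looks for the [vdso] mapping. It assumes a Linux system with /proc mounted; the helper name find_mappings is made up for this illustration.

```python
import os


def find_mappings(maps_text, name="[vdso]"):
    """Return (start, end) address pairs for mappings whose path field is `name`.

    `maps_text` is expected to be in the /proc/<pid>/maps format:
    address-range perms offset dev inode [path]
    """
    found = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Anonymous mappings have only five fields, so skip them.
        if len(fields) >= 6 and fields[5] == name:
            start, end = (int(part, 16) for part in fields[0].split("-"))
            found.append((start, end))
    return found


if __name__ == "__main__":
    path = "/proc/self/maps"  # assumption: Linux with procfs available
    if os.path.exists(path):
        with open(path) as handle:
            for start, end in find_mappings(handle.read()):
                print(f"[vdso] mapped at {start:#x}-{end:#x}")
    else:
        print("no /proc/self/maps on this system")
```

On a typical Linux machine this prints a single small mapping, and the address usually changes from run to run because of address space randomization.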

So, nowadays, a kernel trap is basically just a couple of CPU instructions, hence rather few cycles, compared to tens or hundreds of thousands when using an interrupt (which is really slow on modern CPUs).
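If you want a rough feel for the cost of a user/kernel round trip on your own machine, a sketch like the following can help. It is only an approximation: os.getppid is used here on the assumption that it performs a real system call on each invocation, and the Python interpreter overhead is large compared to the trap itself, so the empty-function baseline is there for comparison. The names time_calls and noop are made up for this example.

```python
import os
import time


def time_calls(fn, n=200_000):
    """Return the average wall-clock seconds per call of fn over n calls."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n


def noop():
    """Baseline: a Python call that never leaves user space."""
    return None


if __name__ == "__main__":
    baseline = time_calls(noop)
    syscall = time_calls(os.getppid)
    print(f"plain Python call : {baseline * 1e9:8.1f} ns")
    print(f"os.getppid() call : {syscall * 1e9:8.1f} ns")
    print(f"difference        : {(syscall - baseline) * 1e9:8.1f} ns")
```

The difference between the two numbers is only a loose upper bound on the trap cost, since it also includes the C wrapper and the kernel's own work for that call.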

A context switch between processes is heavy. It means storing all processor state (registers, etc.) to RAM (at a magic memory location in the user process space actually, guess where!), in practice dirtying all cached memory in the CPU, and reading back the process state for the new process. The new process will (likely) have nothing left in the CPU cache from the last time it ran, so each memory read will be a cache miss and must be served from RAM. This is rather slow.

When I was at university, I "invented" a cache of infinite size that is unpowered when unused, i.e. only used on context switches (well, I did come up with the idea, knowing that there is plenty of die area in a CPU, but not enough cooling if it's constantly powered). I implemented it in Simics, implemented support in Linux for this magic cache, which I called CARD (Context-switch Active, Run-time Drowsy), and benchmarked it rather heavily. I found that it could speed up a Linux machine with lots of heavy processes sharing the same core by about 5%. This was at relatively short (low-latency) process time slices, though.
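A common way to get a rough number for process switching is a pipe ping-pong: two processes pass a single byte back and forth, so each round trip forces the kernel to block one process and wake the other. The sketch below assumes a Unix-like system with os.fork; it tries to pin both processes to one core via os.sched_setaffinity where that is available (Linux only), since otherwise the two processes may simply run on different cores. The function name ping_pong is made up for this example, and the result includes pipe and interpreter overhead, not just the switch itself.

```python
import os
import time


def ping_pong(rounds=10_000):
    """Return average seconds per round trip between parent and child over pipes."""
    original = None
    if hasattr(os, "sched_setaffinity"):
        try:
            # Pin to a single allowed CPU so the two processes must take turns.
            original = os.sched_getaffinity(0)
            os.sched_setaffinity(0, {min(original)})
        except OSError:
            original = None  # pinning not permitted here; measure unpinned

    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte straight back to the parent.
        os.close(to_child_w)
        os.close(to_parent_r)
        try:
            for _ in range(rounds):
                os.write(to_parent_w, os.read(to_child_r, 1))
        finally:
            os._exit(0)

    # Parent: send a byte, then block until the child has answered.
    os.close(to_child_r)
    os.close(to_parent_w)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(to_child_w, b"x")
        os.read(to_parent_r, 1)
    elapsed = time.perf_counter() - start

    os.close(to_child_w)
    os.close(to_parent_r)
    os.waitpid(pid, 0)
    if original is not None:
        os.sched_setaffinity(0, original)  # restore the previous CPU mask
    return elapsed / rounds


if __name__ == "__main__":
    per_round = ping_pong()
    # Each round trip contains at least two switches (parent->child->parent).
    print(f"round trip: {per_round * 1e6:.2f} us")
```

Comparing this figure with the per-call time of a cheap system call on the same machine gives a feel for the gap, although the exact ratio depends heavily on the CPU, the kernel, and how much cache state each process has to rebuild.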

Anyway. A context switch is still pretty heavy, while a kernel trap is basically free.

Answer to the question of which user-space memory location is used, for each process:

At address zero. Yep, the null pointer! You can't read from this entire page from user-space anyway :) This was back in 2005, but it's probably the same now unless the CPU state information has grown larger than a page size, in which case they might have changed the implementation.