15
votes

Let's say we have a CPU with 20 cores and a process with 20 CPU-intensive threads that are independent of each other: one thread per CPU core. I'm trying to figure out whether context switching happens in this case. I believe it does, because there are system processes in the operating system that need CPU time too.

I understand that there are different CPU architectures and that the answers may vary, but can you please explain:

  • How does context switching happen, e.g. on Linux or Windows and on some well-known CPU architectures? And what happens under the hood on modern hardware?
  • What if we have 10 cores and 20 threads or the other way around?
  • How to calculate how many threads we need if we have n CPUs?
  • Does the CPU cache (L1/L2) get emptied after a context switch?

Thank you


1 Answer

20
votes

How does context switching happen, e.g. on Linux or Windows and on some well-known CPU architectures? And what happens under the hood on modern hardware?

A context switch happens when an interrupt occurs and that interrupt, together with the kernel's thread and process state data, specifies a set of running threads that is different from the set running before the interrupt. Note that, in OS terms, an interrupt may be either a 'real' hardware interrupt, which causes a driver to run and that driver to request a scheduling run, or a syscall from a thread that is already running. In either case, the OS scheduling state machine decides whether to change the set of threads running on the available cores.

The kernel can change the set of running threads by stopping one or more threads and running others. It can stop any thread running on any core by queueing a preemption request and raising a hardware interrupt on that core, which forces the core to run its inter-processor driver to handle the request.
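To see the two triggers described above from user space, here is a minimal sketch, assuming a Unix-like system where Python's `resource` module is available. `getrusage` reports voluntary switches (the thread gave up the core, typically inside a blocking syscall) and involuntary ones (the thread was preempted). The helper names `context_switch_counts`, `measure`, `blocking_work` and `cpu_work` are made up for this illustration, and the exact numbers depend on the machine and its load.

```python
import resource
import time


def context_switch_counts():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(work):
    """Run work() and return how many switches of each kind happened meanwhile."""
    vol_before, invol_before = context_switch_counts()
    work()
    vol_after, invol_after = context_switch_counts()
    return vol_after - vol_before, invol_after - invol_before


def blocking_work():
    # Each sleep is a syscall that gives up the core, so it normally
    # shows up as a voluntary context switch.
    for _ in range(20):
        time.sleep(0.001)


def cpu_work():
    # Pure computation: any switch counted here is a preemption,
    # e.g. caused by the timer interrupt or by another runnable task.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    print("blocking  (voluntary, involuntary):", measure(blocking_work))
    print("cpu-bound (voluntary, involuntary):", measure(cpu_work))
```

On an otherwise idle machine, the blocking loop should mostly add voluntary switches, while the CPU-bound loop adds few switches of either kind.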

What if we have 10 cores and 20 threads?

It depends on what the threads are doing. If they are in any state other than ready/running (e.g. blocked on I/O or inter-thread communication), there will be no context switching between them because nothing is running. If they are all ready/running, 10 of them will run forever on the 10 cores until there is an interrupt. Most systems have a periodic timer interrupt that can have the effect of sharing the available cores among the threads.
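The effect of the periodic timer interrupt can be sketched with a toy round-robin model. This is not how Linux or Windows actually schedule (real schedulers use priorities, variable time slices and per-core run queues); it only assumes that every thread is always ready and that each tick ends a time slice. The function name `simulate_round_robin` is hypothetical.

```python
from collections import deque


def simulate_round_robin(num_cores, num_threads, ticks):
    """Toy model: all threads are always ready; each timer tick ends a slice.

    Returns (context_switches, slices_per_thread).
    """
    ready = deque(range(num_threads))   # threads waiting for a core
    running = [None] * num_cores        # thread on each core, None = halted
    slices = [0] * num_threads
    switches = 0

    for _ in range(ticks):
        for core in range(num_cores):
            current = running[core]
            if current is not None:
                if not ready:
                    # Nobody is waiting: the thread keeps its core.
                    slices[current] += 1
                    continue
                ready.append(current)   # preempted by the timer interrupt
                switches += 1
            if ready:
                nxt = ready.popleft()
                running[core] = nxt
                slices[nxt] += 1
    return switches, slices


if __name__ == "__main__":
    print(simulate_round_robin(10, 20, 4))  # 20 ready threads share 10 cores
    print(simulate_round_robin(10, 10, 4))  # one thread per core
    print(simulate_round_robin(20, 10, 4))  # more cores than threads
```

In this model, 20 ready threads on 10 cores are switched on every tick and each gets half of the time slices, while 10 threads on 10 or 20 cores are never switched.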

or the other way around

10 threads run on 10 cores. The other 10 cores are halted. The OS may move the threads around the cores, e.g. to prevent uneven heat dissipation across the die.

How to calculate how many threads we need if we have n CPUs?

It is app-dependent. It would be nice if all cores were always 100% busy with exactly as many ready threads as cores but, since most threads are blocked for much more time than they are running, it is difficult to come up with an optimal number, except in some edge cases (e.g. your '20 CPU-intensive threads on 20 cores').
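As a starting point only, a commonly cited rule of thumb (not something stated in this answer) sizes a pool as cores × (1 + wait time / compute time). The sketch below assumes you can estimate how long a typical task spends blocked versus computing; `suggested_thread_count` is a hypothetical helper, and the result should be checked by measurement rather than trusted.

```python
import math
import os


def suggested_thread_count(cores, wait_time, compute_time):
    """Rule-of-thumb pool size: cores * (1 + wait_time / compute_time).

    wait_time and compute_time are estimates per task, in the same unit.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if compute_time <= 0 or wait_time < 0:
        raise ValueError("compute_time must be > 0 and wait_time >= 0")
    return max(1, math.ceil(cores * (1 + wait_time / compute_time)))


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    # CPU-bound work never waits, so the estimate equals the core count.
    print("cpu-bound:", suggested_thread_count(cores, 0, 1))
    # Tasks that block 90 ms for every 10 ms of computation.
    print("io-heavy: ", suggested_thread_count(cores, 90, 10))
```

For the '20 CPU-intensive threads on 20 cores' case the wait time is zero, so the estimate is simply 20.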

Does the CPU cache (L1/L2) get emptied after a context switch?

Maybe. It depends entirely on the data usage of the threads. The caches get reloaded on demand, as usual. There is no 'context-switch total cache reload' but, if the threads access different, large arrays of data while running, then the cache (L1 at least) will indeed get fully reloaded during the thread run.
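The on-demand behaviour can be illustrated with a toy model. It assumes a fully associative cache with LRU replacement, which is a deliberate simplification of real L1/L2 caches, and it tracks abstract 'lines' rather than real addresses. `ToyCache` and `misses_after_switch_back` are hypothetical names.

```python
from collections import OrderedDict


class ToyCache:
    """Fully associative LRU cache holding `capacity` lines (simplified)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)    # hit: mark as most recently used
            return
        self.misses += 1                    # miss: line is loaded on demand
        self.lines[line] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line


def misses_after_switch_back(capacity, a_lines, b_lines):
    """Run thread A, then thread B, then A again on one core.

    Returns how many misses A suffers on its second run.
    """
    cache = ToyCache(capacity)
    for line in a_lines:
        cache.access(line)
    for line in b_lines:
        cache.access(line)
    before = cache.misses
    for line in a_lines:
        cache.access(line)
    return cache.misses - before


if __name__ == "__main__":
    a = list(range(4))
    small_b = list(range(100, 104))  # A and B together fit in 8 lines
    large_b = list(range(100, 108))  # B alone fills all 8 lines
    print(misses_after_switch_back(8, a, small_b))
    print(misses_after_switch_back(8, a, large_b))
```

When both working sets fit, thread A finds its data still cached after the switch back; when thread B touches enough data to fill the cache, A has to reload everything, but only as it touches each line again.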