6
votes

On a bare-metal system (embedded microcontroller, no MMU, no paging), which is more expensive: a full context switch (register save and restore) or a function call (activation-record allocation)?

I understand that this is highly dependent on calling convention and hardware capability, but how would I go about evaluating this?

EDIT:

To provide more context, I'm trying to model two scheduling schemes. The first is a pre-emptive scheduler with context switching between tasks. The second is a function-pointer run queue, where tasks are state machines broken into several enqueue-able function calls (and enqueuing happens on an IO-event-driven basis).

For the most part, I can gather good data on how long my tasks take (both IO and CPU time) but I need some help figuring out the additional overhead costs to add as constants in my model.

2
Are you trying to perform one specific activity and wondering which concept to use? – nj-ath
@darknight - added new comments above – Jon

2 Answers

1
votes

Since the system calls that trigger context switches are themselves function calls, and the hardware interrupts that can trigger context switches behave similarly (they require signalling an event/semaphore and a jump or call into the scheduler entry point), I would say that a function call would be cheaper CPU-cycle-wise, unless an unreasonable number of parameters is being passed.

This smells like an XY problem - why do you ask this? Context switches and function calls are almost orthogonal - one is a stack-based mechanism, the other selects a different stack entirely.

0
votes

You'd go about evaluating this by contrasting the techniques and their actual effects on overall data movement.

For example, on a 6502, an interrupt pushes the program counter and the status register. That's 3 bytes of actual data, and the sequence takes 7 CPU cycles; the A, X, and Y registers are not saved automatically, so the handler has to push them itself if it uses them.

Granted, a 6502 is a much simpler CPU than modern designs, but it serves as a fundamental example of the problem.

Now, a function call can arguably be as little as a Jump to Subroutine, which simply pushes the current PC onto the stack and then changes the PC to the new location. On the 6502, a JSR costs 6 cycles.

If you consider JSR and BRK (software interrupt on 6502) as the primitives, JSR is cheaper than BRK, by 1 cycle. This is outside the costs of standing up the call frame.

Most context switches are done automatically (via a timer interrupt or similar) to simulate multitasking. But some systems use the CPU's trap primitive for system calls (like INT in MS-DOS, and TRAP in the old Mac OS). So a software interrupt still has to stand up a stack frame, just like a normal subroutine would.

In the end, a JSR is likely cheaper than any of the higher-level switch mechanisms, simply because it's so lightweight. The interrupt mechanisms usually involve an indirection through a vector table (which is why they're used for system calls so much) that a subroutine call does not need, since the compiler resolves subroutine addresses at compile (or link) time.

But those are the considerations to look at to evaluate raw performance.