6
votes

I have loaded an IDT with 256 entries, all pointing to similar handlers:

  • for exceptions 8 and 10-14, push the exception number (these exceptions push an error code automatically)
  • for the others, push a "dummy" error code and the exception number;
  • then jump to a common handler

So when the common handler enters, the stack is properly aligned and contains the exception/interrupt number, error code (which may just be a dummy), eflags, cs and eip.
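As a rough sketch of that layout, here is how the frame could be described in C. It assumes 32-bit protected mode, no privilege change (so the CPU pushes no ESP/SS), and that the stub pushes the exception number last. The names `trap_frame` and `TRAP_FRAME_PROLOGUE` are made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical name. Fields are listed from the top of the stack
 * (lowest address) upward, i.e. in reverse push order. */
struct trap_frame {
    uint32_t vector;      /* pushed by the stub */
    uint32_t error_code;  /* pushed by the CPU, or a dummy from the stub */
    uint32_t eip;         /* pushed by the CPU */
    uint32_t cs;          /* pushed by the CPU (selector in the low 16 bits) */
    uint32_t eflags;      /* pushed by the CPU */
};

static_assert(offsetof(struct trap_frame, eip) == 8,
              "two dwords sit on top of the frame that iret expects");
static_assert(sizeof(struct trap_frame) == 20, "five dwords, no padding");

/* Number of bytes the common handler must discard before iret. */
enum { TRAP_FRAME_PROLOGUE = offsetof(struct trap_frame, eip) };
```

Under these assumptions the handler has to drop 8 bytes (the number and the error code) so that ESP points at the saved EIP when iret executes.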

My question is about returning from the interrupt handler. I use iret to return after popping the exception number and the error code off the stack, but this doesn't work for exception number 8; if I leave the error code on the stack, then it returns fine!

Questions:

  • do I have to leave the error code on the stack for exceptions that put the error code there? If so, how does iret determine whether it has to pop an error code or not?
  • as soon as I enable interrupts I always get exception 8 (double fault), but then everything runs fine (I'm developing a hobby OS). Is this normal behavior or do I have a bug somewhere?
Also, pointers to the Intel manuals would be most welcome :) I haven't found anything regarding these problems there yet.
(Joao da Silva)

4 Answers

13
votes

If the CPU pushed an error code automatically, the handler must pop it before the iret. The iret instruction doesn't know where you're coming from, whether it's a fault, a trap or an external interrupt. It always does the same thing, and it assumes that there's no error code on the stack.

Quoting from the SDM (Software Developer's Manual), Volume 3, Chapter 5, section 5.13 titled Error Code:

The error code is pushed on the stack as a doubleword or word (depending on the default interrupt, trap, or task gate size). To keep the stack aligned for doubleword pushes, the upper half of the error code is reserved. Note that the error code is not popped when the IRET instruction is executed to return from an exception handler, so the handler must remove the error code before executing a return.

You can find the IA-32 Software Developer's Manual here: http://www.intel.com/products/processor/manuals/

Volume 3 part 1, chapter 5, describes exception and interrupt handling. Volume 2 part 1 has the spec for the iret instruction.

1
votes

I wrote a small x86 OS a while back. Take a look at the file isr.asm in the cvs repository.

Notice how we set up the handlers: most push a dummy dword onto the stack to match the few handlers that automatically get an error code pushed. Then, when we return via an iret, we can always assume two extra dwords on the stack irrespective of the interrupt and perform an add esp, 8 before the iret to clean things up nicely.

That should answer your first question.

As for your second question: a double fault when you enable interrupts... hmmm, could be a problem with paging if you haven't set it up correctly. Could be a million other things too :)

1
votes

I had a similar problem with "double faults" as soon as I enabled interrupts. Well, they looked like double faults, but they really were timer interrupts!

Double faults are interrupt number 8.

Unfortunately, a default PIC configuration signals timer interrupts as interrupt number (DEFAULT_PIC_BASE + TIMER_OFFSET) = (8 + 0) = 8.

Masking out all my PIC interrupts (until I was ready to properly configure the PIC) silenced these double-fault-lookalike timer interrupts.

(PICs require the CPU to acknowledge interrupts before they produce the next one. Since your code wasn't acknowledging the initial timer interrupt, the PIC never gave you any more! That's why you only got one, rather than the zillion one might have expected.)
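For illustration, here is a sketch of the usual 8259 sequence that moves the IRQs away from the exception vectors, masks every line, and acknowledges an interrupt. The `outb` below is a stand-in that only records the writes so the sketch can run in user space; in a kernel it would be the real port-output instruction. The function names and the choice of bases are assumptions, not taken from the answer above.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_CMD  0x20
#define PIC1_DATA 0x21
#define PIC2_CMD  0xA0
#define PIC2_DATA 0xA1

/* Stand-in for the real port write: it only logs what would be written. */
struct port_write { uint16_t port; uint8_t value; };
static struct port_write port_log[16];
static unsigned port_log_len;

void outb(uint16_t port, uint8_t value)
{
    assert(port_log_len < 16);
    port_log[port_log_len++] = (struct port_write){ port, value };
}

/* Vector the CPU sees for IRQ line 0-7 on a PIC with the given base. */
uint8_t pic_vector(uint8_t base, uint8_t irq)
{
    assert(irq < 8);
    return (uint8_t)(base + irq);
}

/* Move the IRQs to master_base / slave_base and mask every line. */
void pic_remap_and_mask(uint8_t master_base, uint8_t slave_base)
{
    outb(PIC1_CMD, 0x11);          /* ICW1: start init, ICW4 follows */
    outb(PIC2_CMD, 0x11);
    outb(PIC1_DATA, master_base);  /* ICW2: vector base */
    outb(PIC2_DATA, slave_base);
    outb(PIC1_DATA, 0x04);         /* ICW3: slave attached to IRQ2 */
    outb(PIC2_DATA, 0x02);         /* ICW3: slave's cascade identity */
    outb(PIC1_DATA, 0x01);         /* ICW4: 8086 mode */
    outb(PIC2_DATA, 0x01);
    outb(PIC1_DATA, 0xFF);         /* mask all lines until handlers exist */
    outb(PIC2_DATA, 0xFF);
}

/* Acknowledge an IRQ (0-15) so the PIC will deliver the next one. */
void pic_eoi(uint8_t irq)
{
    if (irq >= 8)
        outb(PIC2_CMD, 0x20);      /* slave also needs an EOI */
    outb(PIC1_CMD, 0x20);
}
```

With the default base, `pic_vector(8, 0)` is 8, which collides with the double fault; after something like `pic_remap_and_mask(0x20, 0x28)` the timer arrives on vector 0x20 instead. Real hardware may also want a short delay between the initialization writes, which this sketch leaves out.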

0
votes

Do I have to leave the error code on the stack for exceptions that put the error code there?

As others mentioned, you have to do either:

pop %eax
/* Do something with %eax */
iret

Or if you want to ignore the error code:

add $4, %esp
iret

If you don't, iret will pop the error code as the new EIP and the saved EIP as the new CS, and you are likely to get a general protection fault, as mentioned at: Why does iret from a page fault handler generate interrupt 13 (general protection fault) and error code 0x18?
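To see why a leftover error code is harmful, here is a small user-space model of what a same-privilege 32-bit iret reads from the stack. It executes nothing privileged; `simulate_iret`, `iret_result` and the sample values in `example_stack` are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct iret_result { uint32_t eip, cs, eflags; };

/* Models a same-privilege 32-bit iret: reads EIP, CS, EFLAGS, in that
 * order, starting at dword index sp. Index 0 is the top of the stack. */
struct iret_result simulate_iret(const uint32_t *stack, size_t len, size_t sp)
{
    assert(sp + 3 <= len);
    struct iret_result r = { stack[sp], stack[sp + 1], stack[sp + 2] };
    return r;
}

/* Made-up frame as left by a page fault: error code on top. */
static const uint32_t example_stack[4] = {
    0x00000002,  /* error code pushed by the CPU */
    0x00100abc,  /* saved EIP */
    0x00000008,  /* saved CS */
    0x00000202,  /* saved EFLAGS */
};
```

Calling `simulate_iret(example_stack, 4, 1)` (error code removed) yields the saved EIP and CS, while `simulate_iret(example_stack, 4, 0)` (error code left in place) reads every field one slot off, so a code address ends up being treated as a segment selector.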

Minimal working example: a page fault handler that I've created to illustrate this. Try commenting out the pop and see it blow up.

Compare the above with a division error exception handler, which does not need to pop the stack.

Note that if you simply execute int $14, no error code gets pushed: that only happens on the actual exception.

Intel Manual Volume 3 System Programming Guide - 325384-056US September 2015 Table 6-1. "Protected-Mode Exceptions and Interrupts" column "Error Code" contains the list of interrupts that push the error code or not.
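As a sketch, that column can be turned into a small predicate that an IDT setup routine might consult when deciding whether a stub needs a dummy error code. The function name is hypothetical. The list covers vectors 8 and 10-14 from the question plus 17 (alignment check), which also pushes an error code; later processors define a few more, so check the table in your own manual revision.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if the CPU itself pushes an error code
 * for this exception vector. Not exhaustive for recent processors. */
bool pushes_error_code(unsigned vector)
{
    assert(vector < 256);
    switch (vector) {
    case 8:   /* #DF double fault */
    case 10:  /* #TS invalid TSS */
    case 11:  /* #NP segment not present */
    case 12:  /* #SS stack-segment fault */
    case 13:  /* #GP general protection */
    case 14:  /* #PF page fault */
    case 17:  /* #AC alignment check */
        return true;
    default:
        return false;
    }
}
```

Hardware interrupts and software `int` instructions never get an error code, whatever vector they use, so this only describes genuine exceptions.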

38.9.2.2 "Page Fault Error Codes" explains what the error means.

A neat way to deal with this is to push a dummy error code 0 on the stack for the interrupts that don't do this to make things uniform. James Molloy's tutorial does exactly that.

The Linux kernel 4.2 seems to do something similar. Under arch/x86/entry/entry_64.S it models interrupts with has_error_code:

trace_idtentry page_fault do_page_fault has_error_code=1

and then uses it in the same file as:

.ifeq \has_error_code
pushq $-1 /* ORIG_RAX: no syscall to restart */
.endif

which does the push when has_error_code=0.