4
votes

I used to think that x86-64 supports unaligned memory access and invalid memory access always causes segmentation fault (except, perhaps, SIMD instructions like movdqa or movaps). Nevertheless recently I observed bus error with normal mov instruction. Here is a reproducer:

void test(void *a)
{
    asm("mov %0, %%rbp\n\t"
        "mov 0(%%rbp), %%rdx\n\t"
        : : "r"(a) : "rbp", "rdx");
}

int main()
{
    test((void *)0x706a2e3630332d69);
    return 0;
}

(must be compiled with frame pointer omission, e.g. gcc -O test.c && ./a.out).

mov 0(%rbp), %rdx instruction and the address 0x706a2e3630332d69 were copied from a coredump of the buggy program. Changing it to 0 causes segfault, but just aligning to 0x706a2e3630332d60 is still bus error (my guess is that it is related to the fact that address space is 48-bit on x86-64).

The question is: which addresses cause bus error (SIGBUS)? Is it determined by architecture or configured by OS kernel (i.e. in page table, control registers or something similar)?

3
This doesn't take ASLR into account. The address may not be meaningful between executions.Brett Hale

3 Answers

12
votes

SIGBUS is in a sad state. There's no consensus between different operating systems what it should mean and when it is generated varies wildly between operating systems, cpu architectures, configuration and the phase of the moon. Unless you work with a very specific configuration you should just treat it "just like SIGSEGV, but different".

I suspect that originally it was supposed to mean "you tried a memory access that could not possibly be successful no matter what the kernel does", so in other words the exact bit pattern you have in the address can never be a valid memory access. Most commonly this would mean unaligned access on strict alignment architectures. Then some systems started using it for accesses to virtual address space that doesn't exist (like in your example, the address you have can't exist). Then by accident some systems made it also mean that userland tried to touch kernel memory (since at least technically it's virtual address space that doesn't exist from the point of view of userland). Then it became just random.

Other than that I've seen SIGBUS from:

  • access to non-existent physical address from mmap:ed hardware.
  • exec of non-exec mapping
  • access to perfectly valid mapping, but overcommitted memory couldn't be faulted in at this moment (I've seen SIGSEGV, SIGKILL and SIGBUS here, at least one operating system does this differently depending on which architecture you're on).
  • memory management deadlocks (and other "something went horribly wrong, but we don't know what" memory management errors).
  • stack red zone access
  • hardware errors (ECC memory, pci bus parity errors, etc.)
  • access to mmap:ed file where the file contents don't exist (past the end of the file or a hole).
  • access to mmap:ed file where the file contents should exist, but don't (I/O errors).
  • access to normal memory that got swapped out and swap in couldn't be performed (I/O error).
5
votes

Generally, a SIGBUS can be sent on an unaligned memory access, i.e. when writing a 64-bit integer to an address, which is not 8-byte aligned. However, in recent systems. either the hardware itself handles it correctly (albeit a bit slower than an aligned access), or the OS emulates the access it in an exception handler (with 2 or more separate memory accesses).

In this case, the problem is, that an address outside the permissible virtual address address space was specified. Despite a pointer has 64-bit, only the address space from 0-(2^48-1) (0x0-0xffffffffffff) is valid on current 64-bit intel processors. Linux provides even less address space to its processes, from 0-(2^47-1) (which is 0-0x7fffffffffff), the rest (0x800000000000-0xffffffffffff) is used by the kernel.

This means, that the kernel sends a SIGBUS because of an access to an invalid address (every address >= 0x800000000000), as opposed to a SIGSEGV, which means, that an access error to a valid address occurred (missing page entry, wrong access rights, etc.).

2
votes

The only situation where POSIX specifically requires generation of a SIGBUS is, when you create a file-backed mmap region that extends beyond the end of the backing file by more than a whole page, and then access addresses sufficiently far past the end. (The exact words are "References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.", from the specification of mmap.)

In all other circumstances, whether you get a SIGSEGV or a SIGBUS for an invalid memory access, or no signal at all, is left completely up to the implementation.