what type of code can trigger unaligned data access sigbus trap dynamically?

Question

I am looking for SIGBUS on unaligned data access. I am tracking one of this errors and I would like to know how this is happening on sitara am335x. Can someone please give me an example code to describe this or ensure triggering it.

Adding code snippet:

int Read( void *value, uint32_t *size, const uint32_t baseAddress )
{
    uint8_t *userDataAddress = (uint8_t *)( baseAddress + sizeof( DBANode ));
    memcpy( value, userDataAddress, ourDataSize );
    *size = ourDataSize;
    return 0;
}

DBA node is a class object of 20 bytes. baseAddress is an mmap to a shared memory file again of a class object type of DBANode casted to a uint32_t so that the arithmetic can be done.

This is the dissasembly of the section:

    91a8:   e51b3010    ldr r3, [fp, #-16]
    91ac:   e5933000    ldr r3, [r3]
    91b0:   e51b0014    ldr r0, [fp, #-20]  ; 0xffffffec
    91b4:   e51b1008    ldr r1, [fp, #-8]
    91b8:   e1a02003    mov r2, r3
    91bc:   ebffe72b    bl  2e70 <memcpy@plt>
    91c0:   e51b3010    ldr r3, [fp, #-16]
    91c4:   e5932000    ldr r2, [r3]
    91c8:   e51b3018    ldr r3, [fp, #-24]  ; 0xffffffe8
    91cc:   e5832000    str r2, [r3]

00002e70 <memcpy@plt>:
    2e70:   e28fc600    add ip, pc, #0, 12
    2e74:   e28cca08    add ip, ip, #8, 20  ; 0x8000
    2e78:   e5bcf868    ldr pc, [ip, #2152]!    ; 0x868

When the exact same code base was re-built, the problem just disappeared. Can the gcc create 2 different versions of instructions with same optimization of -O0 specified for gcc ?

I also diffed the library so files obj dumps in both compilations. They are exactly the same. The api is used quite often. However, the crash only happens after prolonged use over a few days. I am reading the same node every 500ms. So this is not consistent. Should I be looking at pointer corruption ?

Since you want a specific type of answer this is maybe not a duplicate, but it should give you some insight to your problem - stackoverflow.com/a/18269440/1730895 I wrestled with these bugs porting Win32 code to PocketPC, sometimes they're tricky to solve. The best start is to ensure your structures are packed correctly. Then be careful how you cast and access those members. — Kingsley
Its complicated with the burden of legacy code and 3rd party components. But right now I'll settle for a baseline test code. The actual code is wrapped in cpp objects with a lot of type casting and templates with dynamic memory allocation. — preetam
Do you know if this can happen on a memcpy? My core file actually traces to a memcpy but I am not able to reproduce it. Unfortunately, I am tracing a production unit which means that symbols are actually stripped. — preetam
How are you memcpy()ing, and what are you memcpy()ing - it all comes back to the aligment. Do you need to copy the block like this? What about writing a structure-copying function (perhaps set to be inline), that does a member by member copy. Can you add the problematic data structure to your question? — Kingsley

preetam preetam · Accepted Answer · 2019-05-02T18:14:53

Turns out the baseAddress is the issue. As I mentioned its an mmap to an shared memory location where the mmap can fail. failed mmap returns -1 and the code was checking for NULL and proceeding to write to -1 i.e 0xFFFFFFFF causing a sigbus. The code 1 is seen when we use memcpy. Trying any other access like a direct byte addressing gives a code 3 with sigbus.

I am still not sure why it triggers SIGBUS instead of SIGSEGV. Shouldn't this be a memory violation instead ? Here is an example:

int main(int argc, char **argv)
{
    // Shared memory example                                                    
     const char *NAME = "SharedMemory";                                          
     const int SIZE = 10 * sizeof(uint8_t);                                      
     uint8_t src[]={0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99,0x00};          
     int shm_fd = -1;                                                            

     shm_fd = shm_open(NAME, O_CREAT | O_RDONLY, 0666);                          
     ftruncate(shm_fd, SIZE);                                                    

    // Map shared memory segment to address space                               
     uint8_t *ptr = (uint8_t *) mmap(0, SIZE, PROT_READ | PROT_WRITE | _NOCACHE, MAP_SHARED, shm_fd, 0);
     if(ptr == MAP_FAILED)                                                       
     {                                                                           
          std::cerr << "ERROR in mmap()" << std::endl;                            
      //  return -1;                                                              
      }                                                                           
      printf("ptr = 0x%08x\n",ptr);                                               
      std::cout << "Now storing data to mmap() memory" << std::endl;              
      #if 0                                                                           
      ptr[0] = 0x11;                                                              
      ptr[1] = 0x22;                                                              
      ptr[2] = 0x33;                                                              
      ptr[3] = 0x44;                                                              
      ptr[4] = 0x55;                                                              
      ptr[5] = 0x66;                                                              
      ptr[6] = 0x77;                                                              
      ptr[7] = 0x88;                                                              
      ptr[8] = 0x99;                                                              
      ptr[9] = 0x00;                                                              
      #endif                                                                          

      memcpy(ptr,src,SIZE);   //causes sigbus code 1                              
      shm_unlink(NAME);
}

I still do not know why mmap is failing on an shm even though I have a 100MB of RAM available and all my resource limits are set to unlimited with over 400 fds (file descriptors) still available out of 1000 fds limit. !!!

what type of code can trigger unaligned data access sigbus trap dynamically?

2 Answers