1
votes

I am looking for SIGBUS on unaligned data access. I am tracking one of this errors and I would like to know how this is happening on sitara am335x. Can someone please give me an example code to describe this or ensure triggering it.

Adding code snippet:

int Read( void *value, uint32_t *size, const uint32_t baseAddress )
{
    uint8_t *userDataAddress = (uint8_t *)( baseAddress + sizeof( DBANode ));
    memcpy( value, userDataAddress, ourDataSize );
    *size = ourDataSize;
    return 0;
}

DBA node is a class object of 20 bytes. baseAddress is an mmap to a shared memory file again of a class object type of DBANode casted to a uint32_t so that the arithmetic can be done.

This is the dissasembly of the section:

    91a8:   e51b3010    ldr r3, [fp, #-16]
    91ac:   e5933000    ldr r3, [r3]
    91b0:   e51b0014    ldr r0, [fp, #-20]  ; 0xffffffec
    91b4:   e51b1008    ldr r1, [fp, #-8]
    91b8:   e1a02003    mov r2, r3
    91bc:   ebffe72b    bl  2e70 <memcpy@plt>
    91c0:   e51b3010    ldr r3, [fp, #-16]
    91c4:   e5932000    ldr r2, [r3]
    91c8:   e51b3018    ldr r3, [fp, #-24]  ; 0xffffffe8
    91cc:   e5832000    str r2, [r3]

00002e70 <memcpy@plt>:
    2e70:   e28fc600    add ip, pc, #0, 12
    2e74:   e28cca08    add ip, ip, #8, 20  ; 0x8000
    2e78:   e5bcf868    ldr pc, [ip, #2152]!    ; 0x868

When the exact same code base was re-built, the problem just disappeared. Can the gcc create 2 different versions of instructions with same optimization of -O0 specified for gcc ?

I also diffed the library so files obj dumps in both compilations. They are exactly the same. The api is used quite often. However, the crash only happens after prolonged use over a few days. I am reading the same node every 500ms. So this is not consistent. Should I be looking at pointer corruption ?

2
Since you want a specific type of answer this is maybe not a duplicate, but it should give you some insight to your problem - stackoverflow.com/a/18269440/1730895 I wrestled with these bugs porting Win32 code to PocketPC, sometimes they're tricky to solve. The best start is to ensure your structures are packed correctly. Then be careful how you cast and access those members.Kingsley
Its complicated with the burden of legacy code and 3rd party components. But right now I'll settle for a baseline test code. The actual code is wrapped in cpp objects with a lot of type casting and templates with dynamic memory allocation.preetam
Do you know if this can happen on a memcpy? My core file actually traces to a memcpy but I am not able to reproduce it. Unfortunately, I am tracing a production unit which means that symbols are actually stripped.preetam
How are you memcpy()ing, and what are you memcpy()ing - it all comes back to the aligment. Do you need to copy the block like this? What about writing a structure-copying function (perhaps set to be inline), that does a member by member copy. Can you add the problematic data structure to your question?Kingsley
I'll add in that infopreetam

2 Answers

2
votes

Turns out the baseAddress is the issue. As I mentioned its an mmap to an shared memory location where the mmap can fail. failed mmap returns -1 and the code was checking for NULL and proceeding to write to -1 i.e 0xFFFFFFFF causing a sigbus. The code 1 is seen when we use memcpy. Trying any other access like a direct byte addressing gives a code 3 with sigbus.

I am still not sure why it triggers SIGBUS instead of SIGSEGV. Shouldn't this be a memory violation instead ? Here is an example:

int main(int argc, char **argv)
{
    // Shared memory example                                                    
     const char *NAME = "SharedMemory";                                          
     const int SIZE = 10 * sizeof(uint8_t);                                      
     uint8_t src[]={0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99,0x00};          
     int shm_fd = -1;                                                            

     shm_fd = shm_open(NAME, O_CREAT | O_RDONLY, 0666);                          
     ftruncate(shm_fd, SIZE);                                                    

    // Map shared memory segment to address space                               
     uint8_t *ptr = (uint8_t *) mmap(0, SIZE, PROT_READ | PROT_WRITE | _NOCACHE, MAP_SHARED, shm_fd, 0);
     if(ptr == MAP_FAILED)                                                       
     {                                                                           
          std::cerr << "ERROR in mmap()" << std::endl;                            
      //  return -1;                                                              
      }                                                                           
      printf("ptr = 0x%08x\n",ptr);                                               
      std::cout << "Now storing data to mmap() memory" << std::endl;              
      #if 0                                                                           
      ptr[0] = 0x11;                                                              
      ptr[1] = 0x22;                                                              
      ptr[2] = 0x33;                                                              
      ptr[3] = 0x44;                                                              
      ptr[4] = 0x55;                                                              
      ptr[5] = 0x66;                                                              
      ptr[6] = 0x77;                                                              
      ptr[7] = 0x88;                                                              
      ptr[8] = 0x99;                                                              
      ptr[9] = 0x00;                                                              
      #endif                                                                          

      memcpy(ptr,src,SIZE);   //causes sigbus code 1                              
      shm_unlink(NAME);
}

I still do not know why mmap is failing on an shm even though I have a 100MB of RAM available and all my resource limits are set to unlimited with over 400 fds (file descriptors) still available out of 1000 fds limit. !!!

1
votes

From the Cortex-A8 Technical Reference Manual:

The processor supports loads and stores of unaligned words and halfwords. The processor makes the required number of memory accesses and transfers adjacent bytes transparently.

Note Data accesses that cross a word boundary can add to the access time.

Setting the A bit in the CP15 c1 Control Register enables alignment checking. When the A bit is set to 1, two types of memory access generate a Data Abort signal and an Alignment fault status code:

  • a 16-bit access that is not halfword-aligned

  • a 32-bit load or store that is not word-aligned

Alignment fault detection is a mandatory address-generation function rather than an optionally supported function of external memory management hardware.See the ARM Architecture Reference Manual for more information on unaligned data access support.

From the ARM ARM, instructions which always generate an alignment fault if not aligned to the transfer size: LDREX, STREX, LDREXD, STREXD, LDM, STM, LDRD, RFE, SRS, STRD, SWP, LDC, LDC2, STC, STC2, VLDM, VLDR, VPOP, VPUSH, VSTM, VSTR.

Also, most PUSH, POP and VLDx where :align: is specified.

Further,

In an implementation that includes the Virtualization Extensions, an unaligned access to Device or Strongly-ordered memory always causes an Alignment fault Data Abort exception

As in the linked question, structs are the most obvious way to cause 'intended' unaligned accesses, but any corruption of the stack pointer or other variable pointer would also give the same result. Depending on how the core is configured will affect if normal single word/halfword accesses are just slow, or trigger a fault.

If you have access to the ETM trace, you would be able to identify the exact accesses. It seems that part has ETM/ETB (so no fancy trace capture device is required), but I've no idea how easy it will be to get tools to work with it.

As regards what code can trigger this, yes, even memcpy() could be a problem. Since the ARM instruction set has optimisations for transferring multiple registers (or register pairs in AA64), the optimised library functions will prefer to 'stream' data rather than perform byte by byte load and stores. Depending on the data structures and compilation target, it is perfectly possible to end up with illegal LDM to unaligned addresses.