
I am using GCC to compile a program for an ARM Cortex-M3.
My program results in a hardfault, and I am trying to troubleshoot it.

The GCC version is 10.3.1, but I have reproduced this with older versions too (e.g. 9.2).

The hardfault occurs only when optimizations are enabled (-O3).

The problematic function is the following:

void XTEA_decrypt(XTEA_t * xtea, uint32_t data[2])
{
    uint32_t d0 = data[0];
    uint32_t d1 = data[1];
    uint32_t sum = XTEA_DELTA * XTEA_NUMBER_OF_ROUNDS;

    for (int i = XTEA_NUMBER_OF_ROUNDS; i != 0; i--)
    {
        d1 -= (((d0 << 4) ^ (d0 >> 5)) + d0) ^ (sum + xtea->key[(sum >> 11) & 3]);
        sum -= XTEA_DELTA;
        d0 -= (((d1 << 4) ^ (d1 >> 5)) + d1) ^ (sum + xtea->key[sum & 3]);
    }

    data[0] = d0;
    data[1] = d1;
}

I noticed that the fault happens at this line:

    data[0] = d0;

Disassembling it gives me:

49          data[0] = d0;
0000f696:   lsrs    r0, r3, #5
0000f698:   eor.w   r0, r0, r3, lsl #4
0000f69c:   add     r0, r3
0000f69e:   ldr.w   r12, [sp, #4]
0000f6a2:   eors    r5, r0
0000f6a4:   subs    r2, r2, r5
0000f6a6:   strd    r2, r3, [r12]
0000f6aa:   add     sp, #12
0000f6ac:   ldmia.w sp!, {r4, r5, r6, r7, r8, r9, r10, r11, pc}
0000f6b0:   ldr     r3, [sp, #576]  ; 0x240
0000f6b2:   b.n     0xfda4 <parseNode+232>

And the offending line is specifically:

0000f6a6:   strd    r2, r3, [r12]

GCC generates code that uses strd with an unaligned memory address, which is not allowed on my architecture.

How can this issue be fixed?
Is this a compiler bug, or does the code somehow confuse GCC?
Is there any flag to alter this behavior in GCC?

The function belongs to an external library, so I cannot modify it.
In any case, I would prefer a solution that makes GCC emit correct instructions over one that modifies the code, because I need to be sure that the bug is actually fixed and is not lurking elsewhere in the code.

How do you call the function? Any dirty casts there? - Lundin
Perhaps you are calling the function with an unaligned data pointer? (I.e. a pointer not on a 4-byte boundary in this case.) - Ian Abbott
Otherwise, do a check like volatile uint32_t my_debug_var = (sum >> 11) & 3; before the loop, then check the value of that one. If it's a ridiculously large value, then you have an array out of bounds bug (which can topple the stack frame etc). Possibly the actual crash happens when you attempt to return from the function. - Lundin
The address used here comes from the stack. Assuming this is the data argument, it might be misaligned. (Your subset of the disassembly doesn't show if these were initially loaded with an LDRD, which would fail on misalignment, where a plain LDR might succeed if you have alignment checking disabled.) - Hasturkun
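
To test the unaligned-pointer theory from the comments, a check at the call site might look like the sketch below. The helper name is_word_aligned is made up for illustration and is not part of the library; the sketch assumes uintptr_t is available on the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the library): returns true when p sits on
   a 4-byte boundary, which a uint32_t access needs and which LDRD/STRD
   on the Cortex-M3 require. */
static bool is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0u;
}

/* Hypothetical debug guard: call this on the pointer right before it is
   handed to XTEA_decrypt, so a bad pointer trips here and not inside
   the library. */
static void check_block_pointer(const void *data)
{
    assert(is_word_aligned(data));
}
```

If the assertion fires, the pointer was most likely produced by casting a byte pointer (or a member of a packed struct) to uint32_t *, and the compiler is entitled to assume it is aligned.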

1 Answer


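The easiest fix is to align the buffer you pass as data to 8 bytes. (strd on the Cortex-M3 only needs a word-aligned address, so 8 bytes is more than enough.)
Declare the array in the calling code like this: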

__attribute__((aligned(8))) uint32_t data[2];
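
If the block actually lives at an arbitrary offset inside a byte buffer, so its declaration cannot simply be given an alignment attribute, one caller-side option is to copy it through an aligned temporary. The following is only a sketch under assumptions: the layout of XTEA_t, the values of XTEA_DELTA and XTEA_NUMBER_OF_ROUNDS (the usual XTEA constants), and the wrapper name decrypt_block_at are all guesses for illustration, since the question does not show them. XTEA_decrypt is repeated from the question only so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed definitions; the real ones live in the external library. */
#define XTEA_DELTA            0x9E3779B9u
#define XTEA_NUMBER_OF_ROUNDS 32
typedef struct { uint32_t key[4]; } XTEA_t;

/* Library function as posted in the question, unchanged. */
void XTEA_decrypt(XTEA_t * xtea, uint32_t data[2])
{
    uint32_t d0 = data[0];
    uint32_t d1 = data[1];
    uint32_t sum = XTEA_DELTA * XTEA_NUMBER_OF_ROUNDS;

    for (int i = XTEA_NUMBER_OF_ROUNDS; i != 0; i--)
    {
        d1 -= (((d0 << 4) ^ (d0 >> 5)) + d0) ^ (sum + xtea->key[(sum >> 11) & 3]);
        sum -= XTEA_DELTA;
        d0 -= (((d1 << 4) ^ (d1 >> 5)) + d1) ^ (sum + xtea->key[sum & 3]);
    }

    data[0] = d0;
    data[1] = d1;
}

/* Hypothetical wrapper: decrypts one 8-byte block stored at any byte
   address by copying it through a properly aligned temporary. */
static void decrypt_block_at(XTEA_t *xtea, uint8_t *bytes)
{
    uint32_t block[2];                   /* naturally 4-byte aligned */

    assert(xtea != NULL && bytes != NULL);
    memcpy(block, bytes, sizeof block);  /* byte-wise copy, safe when unaligned */
    XTEA_decrypt(xtea, block);           /* library only sees the aligned copy */
    memcpy(bytes, block, sizeof block);
}
```

The library is left untouched; only the call site changes. memcpy makes no alignment assumption about its arguments, so the compiler cannot turn the unaligned side of the copy into an ldrd/strd.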