4
votes

I am using the GCC compiler for ARM (on Windows):

arm-none-eabi-gcc.exe (Sourcery CodeBench Lite 2012.09-63) 4.7.2 version

Roughly once in every five builds of the same source file, I get a different object file.

Optimization level 3 (aggressive) is used. The compiler options are:

-O3 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fshort-wchar -fshort-enums -funsafe-math-optimizations -mvectorize-with-neon-quad

Dumping the differing object files with objdump shows many differences in the assembly instructions, registers, and addresses used.

  • Is it normal for the compiler to optimize/compile exactly the same source file differently and produce different object files? Is it a compiler bug?

  • How can I avoid this behavior without turning off aggressive optimization?

EDIT: snippet of the object file differences:

object_file_dump_A:

00000350 <PreInit>:
 350:   e3003000    movw    r3, #0
 354:   e3403000    movt    r3, #0
 358:   e92d4ff0    push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
 35c:   e1a09000    mov r9, r0
 360:   e24dd034    sub sp, sp, #52 ; 0x34
 /*some identical ASM for both files */
 388:   e1a0700b    mov r7, fp
 38c:   e1a0600b    mov r6, fp
 390:   e300a000    movw    sl, #0
 394:   e340a000    movt    sl, #0
 398:   e5911004    ldr r1, [r1, #4]
 39c:   e8ae0003    stmia   lr!, {r0, r1}

object_file_dump_B:

00000350 <PreInit>:
 350:   e3003000    movw    r3, #0
 354:   e3403000    movt    r3, #0
 358:   e92d4ff0    push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
 35c:   e1a08000    mov r8, r0
 360:   e24dd034    sub sp, sp, #52 ; 0x34
  /*some identical ASM for both files */
 388:   e1a0700b    mov r7, fp
 38c:   e3009000    movw    r9, #0
 390:   e3409000    movt    r9, #0
 394:   e5911004    ldr r1, [r1, #4]
 398:   e8ae0003    stmia   lr!, {r0, r1}
 39c:   e5b30010    ldr r0, [r3, #16]!

EDIT:

source code :

void PreInit(init_T *f_params, results_T *results) 
{
  u8 i, j, k, idx;
  const u8 cr_index[4] = {0, 1, 2, 7};
  const u8 minVal[] = {2, 4, 6, 0, 0, 0, 0, 19}; 
  const u8 maxVal[] = {0, 3, 5, 0, 0, 0, 0, 18}; 

  memset(f_params, 0, sizeof(init_T));

  _ASSERT(CONF_NUM_X_LIMITS == CST_NbSLi);
  _ASSERT(CONF_NUM_CRITERIA == CST_NbIdxCriteria);

  for (i = 0; i < CST_NbSLi; ++i)
  {
    f_params->_sli[i].x = s_limits[i];
    for (j = 0; j < CST_NbIdxCriteria; ++j)
    {
      f_params->_sli[i].criteria[j] = conf_criterias[i][j];
    }
  }
/*some code*/
}
1
Are you sure the code is not including some header file that is being regenerated on each build? - jxh
@jxh: Yes, I am sure! There are no regenerated headers included. - Abdurahman
You can try compiling with -E and see if there's a difference. If there is, comparing the preprocessed output will be much easier than comparing machine code. - ugoren
@Abdurahman: There is something seriously fishy going on here. Why is the compiler using movw/movt to load #0 into a register, which can be done by a single instruction? And why is the link register used as a base register for a store-multiple? Can you show us the source of PreInitFuna? - unixsmurf
The assembler code is equivalent. Some algorithms are heuristic; often the same code has approximately the same run time if the memory access pattern is the same. It doesn't matter which register is used on the ARM, as they are symmetric. So whether it is mov r9, r0 or mov r8, r0, if r9/r8 are treated the same in both object files, the same thing happens. A heuristic may be guided by a random value. - artless noise
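
To follow up on the -E suggestion in the comments, here is a minimal Python sketch (not from the original thread) that compiles the same file several times and groups the outputs by SHA-256, so you can count how many distinct results you get. The flags are copied from the question. Everything else is an assumption for illustration: that arm-none-eabi-gcc is on PATH, the file and directory names in the usage note, and the helper names build_repeatedly and group_by_digest.

```python
import hashlib
import subprocess
from collections import defaultdict
from pathlib import Path

# Flags copied from the question.
CFLAGS = ["-O3", "-mcpu=cortex-a8", "-mfpu=neon", "-mfloat-abi=softfp",
          "-fshort-wchar", "-fshort-enums", "-funsafe-math-optimizations",
          "-mvectorize-with-neon-quad"]


def build_repeatedly(source, out_dir, runs=10, mode="-c", cc="arm-none-eabi-gcc"):
    """Compile `source` `runs` times.

    mode is "-c" (produce an object file) or "-E" (preprocess only).
    The compiler name `cc` is an assumption; adjust it to your toolchain.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for n in range(runs):
        out = out_dir / f"run_{n:02d}.out"
        subprocess.run([cc, *CFLAGS, mode, str(source), "-o", str(out)], check=True)
        outputs.append(out)
    return outputs


def group_by_digest(paths):
    """Map each SHA-256 digest to the list of files with exactly that content."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(Path(path))
    return dict(groups)
```

For example, `group_by_digest(build_repeatedly("preinit.c", "out_E", mode="-E"))` (hypothetical file names) returning a single group would suggest the preprocessed input is stable, so any variation seen with `mode="-c"` comes from the compiler proper rather than from the headers.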

1 Answer

0
votes

As others have mentioned, the two assembly listings are equivalent. Look closely at this instruction from the first dump:

e1a0600b mov r6, fp

It copies fp into r6, but r6 is not used afterwards. So, if register allocation and code generation rely on randomized heuristics, the variations are small, and in the second dump the optimizer simply removed this instruction.
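
To make the comparison concrete, here is a small sketch (my own illustration, not part of the toolchain) that takes the tail of the two dumps from the question, drops the address and encoding columns, and lists the instructions that appear on only one side. The helper names instructions and only_in_each are made up for this example.

```python
import difflib

# Tail of the two dumps, copied from the question (0x388 to 0x39c).
DUMP_A = """\
 388:   e1a0700b    mov r7, fp
 38c:   e1a0600b    mov r6, fp
 390:   e300a000    movw    sl, #0
 394:   e340a000    movt    sl, #0
 398:   e5911004    ldr r1, [r1, #4]
 39c:   e8ae0003    stmia   lr!, {r0, r1}
"""

DUMP_B = """\
 388:   e1a0700b    mov r7, fp
 38c:   e3009000    movw    r9, #0
 390:   e3409000    movt    r9, #0
 394:   e5911004    ldr r1, [r1, #4]
 398:   e8ae0003    stmia   lr!, {r0, r1}
 39c:   e5b30010    ldr r0, [r3, #16]!
"""


def instructions(dump):
    """Drop the address and encoding columns, keep 'mnemonic operands'."""
    result = []
    for line in dump.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].endswith(":"):
            result.append(" ".join(fields[2:]))
    return result


def only_in_each(a, b):
    """Return (instructions only in a, instructions only in b), in order."""
    only_a, only_b = [], []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            only_a.extend(a[i1:i2])
            only_b.extend(b[j1:j2])
    return only_a, only_b


only_a, only_b = only_in_each(instructions(DUMP_A), instructions(DUMP_B))
print("only in A:", only_a)
print("only in B:", only_b)
```

On this snippet, A has the extra `mov r6, fp` and uses sl where B uses r9 for the movw/movt pair. The trailing `ldr r0, [r3, #16]!` shows up as B-only just because the snippet for A is cut off one instruction earlier: everything after the removed mov is shifted up by 4 bytes in B.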