
I want to divide a 64-bit number by a 32-bit number on an ARM Cortex-M3 device using ARM inline assembler.

Dividing a 32-bit number by a 32-bit number works fine; the code is shared below. Please let me know what needs to change or be added so that I can do 64-bit division.

long Divide(long i, long j)
{
    long res;
    /* SDIV: signed 32-bit divide, res = i / j */
    asm ("sdiv %0, %[input_i], %[input_j]"
        : "=r" (res)
        : [input_i] "r" (i), [input_j] "r" (j)
        );
    return res;
}
Why inline asm at all? Write int32_t div64_32(int64_t x, int32_t y) { return x/y; } and let the compiler call a helper function that uses multiple division instructions for extended-precision division. (I think Cortex-M3 doesn't have 64/64 or 64/32 in 1 instruction.) - Peter Cordes
Also see what the compiler generates: godbolt.org/z/j2JGR9. It converts the 32-bit value into a 64-bit value and calls a library function (which is likely hand-tuned). - Codo
Our controller doesn't have a hardware FPU (floating-point unit), so it doesn't recognize floating-point numbers, and we do all calculations in 24-bit fixed-point format. As a result, the intermediate values in divisions and multiplications exceed 32 bits. Also, per company standards, we should not use any standard compiler libraries. So please let me know if there is any way I can perform 64-bit division. - Manjunath
Nothing anyone suggested in comments involves floating point. IDK why you even mention that. If you get a linker error, it's because you forgot to include libgcc.a when linking; it has the implementation of GCC's helper functions. - Peter Cordes
What compiler are you using? Your constraints sound really strange to me. I can make an ATSAMG55 (Cortex-M4 w/o FPU) divide 64-bit/32-bit, and IAR makes an internal call to __aeabi_uldivmod. Stating "should not use any standard compiler libraries" can have only one meaning: DO NOT USE C/C++, write everything in assembler (which is stupid). - firda
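For context on the fixed-point case raised in the comments, here is a minimal C sketch of multiply and divide in a 24-fractional-bit format, showing where the 64-bit intermediate (and hence the 64-by-32 division) comes from. The names q24_t, Q24_ONE, q24_mul and q24_div are hypothetical, and the Q8.24 layout is an assumption, since the exact format isn't stated in the thread. With GCC targeting Cortex-M3, the division in q24_div is typically lowered to a call to a runtime helper such as __aeabi_ldivmod rather than to a single instruction.

```c
#include <stdint.h>

/* Hypothetical Q8.24 fixed-point type: 8 integer bits, 24 fractional bits. */
typedef int32_t q24_t;

/* Hypothetical constant: the value 1.0 in Q8.24, widened to 64 bits. */
#define Q24_ONE ((int64_t)1 << 24)

/* Multiply: the raw product has 48 fractional bits and needs 64 bits,
 * so it is scaled back by 2^24 (rounding toward zero). */
static q24_t q24_mul(q24_t a, q24_t b)
{
    return (q24_t)(((int64_t)a * b) / Q24_ONE);
}

/* Divide: the dividend is pre-scaled by 2^24, which makes it a 64-bit
 * value divided by a 32-bit one. The caller must ensure b != 0 and that
 * the result fits in 32 bits. */
static q24_t q24_div(q24_t a, q24_t b)
{
    return (q24_t)(((int64_t)a * Q24_ONE) / b);
}
```

For example, with 1.5 encoded as 0x01800000 and 2.0 as 0x02000000, q24_mul gives 0x03000000 (3.0), and q24_div(0x03000000, 0x02000000) gives back 0x01800000.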

1 Answer


The Cortex-M ISA currently has no 64-bit integer division instruction; SDIV and UDIV only take 32-bit operands.

You'll have to program it.

The following is an example I just wrote down. It is probably vastly inefficient and buggy.

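    .syntax unified
    .cpu cortex-m3
    .fpu softvfp
    .thumb

    .global div64

    .section .text.div64
    .type div64, %function

@ In:  r1:r0 = 64-bit unsigned dividend (r1 = high word), r2 = 32-bit divisor
@ Out: r0 = low 32 bits of the quotient
@ Caveat: for divisors above 2^31 the remainder left in r1 can have bit 31
@ set; clz then returns 0 and the loop no longer makes progress.
div64:
  cbz   r1, normal_divu
@ save the callee-saved registers (push/pop, not stm/ldm with sp!)
  push  {r4-r7}
  mov   r6, #0
  mov   r7, #32
rot_init:
  cbz   r7, exit
@ count the free (zero) bits on the left of the higher word
  clz   r3, r1
@ limit the shift to the bits that are still left to process
  cmp   r7, r3
  it    mi
  movmi r3, r7
@ update the count of remaining bits
  sub   r7, r3
@ shift the upper word left r3 times
  lsl   r1, r3
@ compute the right shift that extracts the top r3 bits of the lower word
  rsb   r4, r3, #32
@ extract the top bits of the lower word
  lsr   r4, r0, r4
@ add them to the higher word
  add   r1, r4
@ shift the lower word left r3 times
  lsl   r0, r3
@ divide the higher word
  udiv  r5, r1, r2
@ put the remainder in the higher word
  mul   r4, r5, r2
  sub   r1, r4
@ append the new quotient bits to the result
  lsl   r6, r3
  add   r6, r5
  b     rot_init
exit:
  mov   r0, r6
  pop   {r4-r7}
  bx    lr

@ high word is zero: a single 32-bit divide is enough
normal_divu:
  udiv  r0, r0, r2
  bx    lr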