LPC1768 / ARM Cortex-M3 microsecond delay

1

votes

I'm trying to implement a microsecond delay in a bare-metal ARM environment (LPC1768) with GCC. I've seen the examples that use the SysTick timer to generate an interrupt that then does some counting in C, which is used as a time base:

https://bitbucket.org/jpc/lpc1768/src/dea43fb213ff/main.c

However, at a 12 MHz system clock I don't think that will scale very well to microsecond delays. Basically the processor will spend all its time servicing the interrupt.

Is it possible to query the value of SYSTICK_GetCurrentValue in a loop and determine how many ticks go in a microsecond and bail out of the loop once the number of ticks exceeds the calculated number?

I'd rather not use a separate hardware timer for this (but will if there is no other choice).

gcc arm delay

How accurate does it need to be? You can just have a loop with nops and calibrate it for your hardware; that is, figure out how many cycles each iteration takes and calculate the number of iterations. – Guy Sirton

Note that 12MHz is (usually) the crystal frequency, it gets multiplied by the clock circuit to get the actual processor clock rate. LPC1768 can run at up to 100MHz. – Igor Skochinsky

4

votes

One way is just to use a loop to create the delay, something like shown below. You need to calibrate your factor. A more general purpose approach is to calculate the factor on startup based on some known timebase.

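#include <stdint.h>

#define CAL_FACTOR ( 100 )  // interval units per loop iteration; calibrate for your hardware

void delay (uint32_t interval)
{
  uint32_t iterations = interval / CAL_FACTOR;

  for(uint32_t i=0; i<iterations; ++i)  // unsigned counter matches the type of iterations
  {
    __asm__ volatile // gcc-ish syntax, don't know what compiler is used
    (
      "nop\n\t"
      "nop\n\t"
    );
  }
}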
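The idea of calculating the factor at startup from a known timebase could look roughly like the sketch below. It only shows the arithmetic: the helper names are hypothetical, and the inputs (how many loop iterations were run, and how many timer ticks elapsed while they ran) are assumed to come from your own measurement code, which is not shown. Note that it yields iterations per microsecond, the inverse of how CAL_FACTOR is used above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive loop iterations per microsecond from one
 * calibration run. calib_iterations and elapsed_ticks are assumed to be
 * measured on the target; ticks_per_us is the reference timer's rate. */
uint32_t iterations_per_us(uint32_t calib_iterations,
                           uint32_t elapsed_ticks,
                           uint32_t ticks_per_us)
{
    assert(elapsed_ticks > 0);
    /* iterations/us = iterations / (elapsed_ticks / ticks_per_us);
     * widen to 64 bits so the product cannot overflow. */
    uint64_t scaled = (uint64_t)calib_iterations * ticks_per_us;
    return (uint32_t)(scaled / elapsed_ticks);
}

/* Hypothetical helper: loop iterations needed for a delay in microseconds. */
uint32_t iterations_for_delay(uint32_t interval_us, uint32_t iters_per_us)
{
    return interval_us * iters_per_us;
}
```

For example, if 1000 iterations took 3000 ticks of a timer running at 12 ticks per microsecond, that works out to 4 iterations per microsecond. Integer division truncates, so a loop that is slower than one iteration per microsecond would need a different scaling.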

3

votes

First off, interrupts are not needed for this sort of thing; you can poll the timer, and an interrupt is overkill. Yes, there are reasons why those examples use interrupts, but that doesn't mean it is the only way to use a timer.

Guy Sirton's answer is sound, but I prefer assembler as I can control it exactly to the clock cycle (so long as there are no interrupts or other items that get in the way). A timer is usually easier though, as the code is a bit more portable: change the processor clock frequency and you have to re-tune the loop, whereas with a timer sometimes all you have to do is change the init code to use a different prescaler, or change the one line looking for the computed count. A timer also allows for interrupts and such things in the system.

In this case though you are talking about 12 MHz and one microsecond; that is 12 instructions, yes? Put in 12 nops. Or branch to some assembler with 10 nops or 8, whatever it comes out to, to compensate for the pipeline flush on the two branches. A timer with interrupts is going to burn more than 12 instruction cycles in overhead. Even polling the timer in a loop is going to be sloppy. A counter loop would work too, but you need to understand the branch costs and tune for them:

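@ Despite the name, this routine is tuned for roughly one microsecond at 12 MHz.
delay_one_ms:
mov r0,#3
wait:
sub r0,#1 @ Cortex-M3 means Thumb/Thumb-2 and gas complains about subs; in non-unified Thumb syntax this sub sets the flags
bne wait
nop  @ might need some nops to tune the loop accurately
nop
bx lr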

Call this function about 30 million times in a loop, marking the start and end with a GPIO LED or a UART output, and use a stopwatch to check that the marks are 30 seconds apart.

ldr r4,=uart_tx_register_address
mov r5,#0x55
again:
ldr r6,=24000000
str r5,[r4]
top:
bl delay_one_ms
sub r6,#1
bne top
str r5,[r4]
b again

Actually, since I assumed 2 clocks per branch, the test loop itself costs 3 clocks, and the delay is assumed to take 12 clocks in total, so each pass is 15 clocks. 30 seconds is 30,000,000 microseconds, ideally 30 million passes, but I needed 12/15ths of that number (24,000,000) to compensate. This is far easier if you have an oscilloscope whose timebase is somewhat accurate, or at least as accurate as you want this delay to be.

I have not studied ARM's branch costs myself, otherwise I would comment on that. It is likely two or three clocks. So the mov is one clock, the sub is one clock times the number of loops, and the bne is, let's say, two clocks times the number of loops. Add two for the branch to get here and two for the return. That gives 5+(3*loops)+nops=12, so (3*loops)+nops=7, which means loops is 2 and nops is 1, yes? I think stringing a number of nops together is far easier:
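The 12/15ths compensation can be double-checked on a host machine with a few lines of C. This is only a sketch of the arithmetic; the function name is made up for illustration, and the cycle counts are the assumed values from this answer, not measured ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many passes of the test loop fit in target_us,
 * given the clock rate in cycles per microsecond and the assumed cost of
 * one pass (delay routine plus loop overhead) in cycles. */
uint32_t passes_for_duration(uint32_t target_us,
                             uint32_t cycles_per_us,
                             uint32_t cycles_per_pass)
{
    assert(cycles_per_pass > 0);
    /* total cycles available, widened to 64 bits to avoid overflow */
    uint64_t total_cycles = (uint64_t)target_us * cycles_per_us;
    return (uint32_t)(total_cycles / cycles_per_pass);
}
```

With 30,000,000 microseconds, 12 cycles per microsecond and 15 cycles per pass, this gives 24,000,000, which is the constant loaded into r6 in the test loop.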

delay_one_ms:
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    bx lr
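The cycle budget worked out before this listing (5 + 3*loops + nops = 12) can also be written as a small host-side calculation, which is handy if you want to retry it with different branch costs. The names below are hypothetical, and the per-instruction costs you feed in are assumptions that should be checked against the core's documentation or a scope.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting a cycle budget into counted loops plus filler nops. */
typedef struct {
    uint32_t loops; /* iterations of the sub/bne loop */
    uint32_t nops;  /* single-cycle nops to use up the remainder */
} delay_plan;

/* Hypothetical helper: fixed_cycles covers the call, the mov and the
 * return; cycles_per_loop is the assumed cost of one sub + bne pair. */
delay_plan plan_delay(uint32_t budget_cycles,
                      uint32_t fixed_cycles,
                      uint32_t cycles_per_loop)
{
    assert(cycles_per_loop > 0);
    assert(budget_cycles >= fixed_cycles);
    uint32_t remaining = budget_cycles - fixed_cycles;
    delay_plan plan = { remaining / cycles_per_loop,
                        remaining % cycles_per_loop };
    return plan;
}
```

Calling plan_delay(12, 5, 3) gives 2 loops and 1 nop, matching the hand calculation. It treats the final, not-taken bne the same as a taken one, as the text does, so expect to trim by a cycle or so when tuning on real hardware.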

You might have to burn a few more instructions temporarily disabling interrupts, if you use them. If you are looking for "at least" one microsecond then don't worry about it.

3

votes

You can use the SysTick timer inside the ARM processor for that. Just program it to tick every 1 µs, or less if you have enough clock speed, and spin in a loop until your delay value expires. Like this:

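void WaitUs(int us) {

    unsigned int cnt;

    while (us-- > 0) {
       cnt = STK_VAL; // get SysTick counter, assumed to tick every 500 ns
       // SysTick counts down, so elapsed ticks = cnt - STK_VAL (roll-over not handled here)
       while ((cnt - STK_VAL) < 2); // repeat until 2 ticks have passed
    }
}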

Bear in mind that this is only an example; you'll need to adjust it for counter roll-over, among other things.
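One way to deal with the roll-over is to compute the elapsed ticks in a helper that knows the reload value. The sketch below is host-testable and deliberately leaves out the register access: the function names are hypothetical, start and now are assumed to be raw readings of the SysTick current-value register (STK_VAL in the code above), and the result is only valid while less than one full counter period has passed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: timer ticks per microsecond, assuming SysTick is
 * clocked directly from the core clock (e.g. 100 MHz gives 100). */
uint32_t ticks_per_us(uint32_t core_hz)
{
    return core_hz / 1000000u;
}

/* Hypothetical helper: ticks elapsed on a down-counter that runs from
 * reload to 0 and then wraps back to reload (period = reload + 1). */
uint32_t systick_elapsed(uint32_t start, uint32_t now, uint32_t reload)
{
    assert(start <= reload && now <= reload);
    if (start >= now) {
        return start - now;              /* no wrap since start */
    }
    return start + (reload + 1u) - now;  /* counter wrapped once */
}
```

A delay loop would then read the counter once, and spin until systick_elapsed(start, current_reading, reload) reaches us * ticks_per_us(core_hz). For delays longer than one counter period you would need to accumulate elapsed ticks across several readings.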
