Cortex-M3/M4 Timer Interrupts with ARM Assembly

Question

I am trying to get timer interrupts on a CC26x2 MCU, which uses Cortex-M3/M4. If I understand correctly, then in order to get an interrupt I need to change an entry in the vector table to the address of the interrupt handler, and then when a corresponding event happens, it will automatically go to that address. So I perform the following steps in assembly:

1. Relocate the vector table to SRAM (by first copying its every entry and then changing the VTOR register)
2. Write the event handler address to its "GP Timer 0A Interrupt" entry
3. Configure GP Timer 0A to generate interrupts
4. Clear the pending interrupts in NVIC_ICPR0
5. Enable the GP Timer 0A Interrupt in NVIC_ISER0
6. Enable interrupts with CPSIE I

But then when I run the code, the interrupt handler is never called, even though GP Timer 0A interrupt is shown to be pending in the NVIC registers. Apparently, the corresponding interrupt line is not active (as can be seen in NVIC_IABR0 register). What am I doing wrong?

How are you getting the address of the new handler? It has the lsbit set, yes? And what entry in the vector table are you using? — old_timer
@old_timer I get the handler address as ADR R0, EventHandler — Mr_Tusk
And there is an open question that led to an open ticket at bugzilla where ADR has a problem. Print out or examine the result of your adr, or use ORR to force the lsbit set, or use C to do it, etc. — old_timer
@old_timer When I look at the handler address in the memory browser, it is the same as the corresponding entry in the vector table - so ADR seems to be working properly (I also set the lsbit of the address to 1 in order to indicate that the handler is Thumb code) — Mr_Tusk
If this is gnu and adr gave you an address without the lsbit set, then it is broken, BTW, per the gnu documentation; will see what the gnu folks say at the end of the day on this topic. So which vector are you using, what offset/address in the table? Your table address has the proper alignment, yes? You can't put the table anywhere you want. — old_timer

old_timer · Accepted Answer · 2020-01-03T06:08:30

I banged out a working example for you. It runs on a TI part that is Cortex-M4 based, but it is not the same chip/board you have; I don't have that one handy. The peripherals are not necessarily the same as on your TI part, but the Cortex-M4 handling should be the same.

Mine is the MSP432P401R LaunchPad. As you should know, before you start you want the datasheet for the LaunchPad, the datasheet for the MCU, the technical reference manual for the MCU, the ARM Cortex-M4 technical reference manual, and the ARMv7-M architecture reference manual.

The code below is completely stand-alone; all you need to add is a gnu toolchain for ARM from the last 10 years or so. That removes any interference from other code. Every line of code you add adds risk. If you can get it to work at this level, then you know you understand the cpu and the peripheral well enough to move forward. Adding it to a larger project, or to something that uses a library, adds the risk of that other code, but you at least have the warm fuzzy feeling that you know this peripheral and how you are using it. So if things don't work, either you ported the stand-alone experiment wrong or something in the larger program is interfering.

I use openocd to talk to this part. When I first got this board however many years ago (can you even get this board any more?), flashing without their sandbox meant writing my own programs to do it. If the (user application) flash is erased, then the built-in bootloader runs, which changes the clocks and other things. So I have programmed the flash with a program that turns off the WDT and sits in an infinite loop. Now I can do development in sram using openocd quite easily:

reset halt
load_image notmain.sram.elf
resume 0x01000000

and repeat those three lines each time I want to try another experiment.

I tend to start with an led blinker, use the systick or a timer with the led to determine/confirm the internal clock rate, then move on to the uart, where I have a simple routine that prints hex numbers. It is a dozen or so lines of code, not horribly massive like printf, and does everything I need. Interrupts are an advanced topic no matter how many decades of experience you have, and when diving into them you ideally need a way to visualize what is going on. An LED works in a pinch, but a uart is far better. You want to start with the peripheral stand-alone if possible, polling. In this case I am using TIMER32 number 1. TI's style is to have the memory space addresses in the datasheet and how to use them in the reference manual. TI has both a raw interrupt status register and a masked interrupt status register.

Starting with the mask disabled, learn the timer, its interrupt, and how to clear it by polling the RIS register.

Once you have mastered that, enable the interrupt in the peripheral, ensuring that you have not enabled it into the core of the processor in any way, and see that the masked interrupt status (in my case) as well as bit 22 in ICSR, ISRPENDING, get set. That confirms the interrupt is making it from the chip vendor's logic to the ARM core.

TI's style is to also have the interrupt table list in the datasheet. For the timer I am using I see:

INTISR[25] Timer32_INT1

So next I spam NVIC_ISER0, turning all the bits on (this is a targeted test, nothing else should be going on in the chip). I have executed cpsid i to keep the interrupts out of the core.

Then I examine the ICSR after the interrupt, and in my case the VECTPENDING field is 0x29, or 41, which is 16+25. That matches the datasheet. If I now change NVIC_ISER0 to 1<<25 only and repeat, I get the same answer: VECTPENDING is 0x29. Can now move forward.
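As a rough sketch of the check described above (it is not part of the original example), the ICSR value can be pulled apart in C like this. The field positions assume the ARMv7-M ICSR layout, with ISRPENDING at bit 22 and VECTPENDING treated as a 9-bit field starting at bit 12; some cores implement fewer bits, so check the manual for yours. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ARMv7-M ICSR field positions; confirm against your core's manual. */
#define ICSR_ISRPENDING_BIT  22u
#define ICSR_VECTPENDING_POS 12u
#define ICSR_VECTPENDING_MSK 0x1FFu   /* assumed 9-bit field */

/* Returns 1 if ISRPENDING is set in this ICSR value, else 0. */
unsigned icsr_isr_pending(uint32_t icsr)
{
    return (icsr >> ICSR_ISRPENDING_BIT) & 1u;
}

/* Exception number held in VECTPENDING (0 means nothing is pending). */
unsigned icsr_vect_pending(uint32_t icsr)
{
    return (icsr >> ICSR_VECTPENDING_POS) & ICSR_VECTPENDING_MSK;
}

/* External interrupt number for an exception number of 16 or more. */
unsigned exception_to_irq(unsigned exception)
{
    assert(exception >= 16u);   /* 0..15 are the system exceptions */
    return exception - 16u;
}
```

Fed a value read with GET32(ICSR) on the target, icsr_vect_pending() should return 0x29 for the case described here, and exception_to_irq(0x29) gives 25, the INTISR number from the datasheet.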

Here is where you have choices and have to master your tools. I skipped using the power-on VTOR of 0x00000000 with the vector table in flash and moved straight to sram, which is what you want and is also how I am developing. First, from the arm documentation you see that VTOR has to be aligned. I set it to the beginning of sram, 0x01000000, and set up my entry code (sram style, not flash style) to resemble a vector table but without the stack pointer init value. That takes us into the example:

sram.s

.thumb

.thumb_func
.global _start
_start:
b reset
nop
.word loop /*0x0004 1    Reset                   */
.word loop /*0x0008 2    NMI                     */
.word loop /*0x000C 3    HardFault               */
.word loop /*0x0010 4    MemManage               */
.word loop /*0x0014 5    BusFault                */
.word loop /*0x0018 6    UsageFault              */
.word loop /*0x001C 7    Reserved                */
.word loop /*0x0020 8    Reserved                */
.word loop /*0x0024 9    Reserved                */
.word loop /*0x0028 10   Reserved                */
.word loop /*0x002C 11   SVCall                  */
.word loop /*0x0030 12   DebugMonitor            */
.word loop /*0x0034 13   Reserved                */
.word loop /*0x0038 14   PendSV                  */
.word loop /*0x003C 15   SysTick                 */
.word loop /*0x0040 16   External interrupt  0   */
.word loop /*0x0044 17   External interrupt  1   */
.word loop /*0x0048 18   External interrupt  2   */
.word loop /*0x004C 19   External interrupt  3   */
.word loop /*0x0050 20   External interrupt  4   */
.word loop /*0x0054 21   External interrupt  5   */
.word loop /*0x0058 22   External interrupt  6   */
.word loop /*0x005C 23   External interrupt  7   */
.word loop /*0x0060 24   External interrupt  8   */
.word loop /*0x0064 25   External interrupt  9   */
.word loop /*0x0068 26   External interrupt 10   */
.word loop /*0x006C 27   External interrupt 11   */
.word loop /*0x0070 28   External interrupt 12   */
.word loop /*0x0074 29   External interrupt 13   */
.word loop /*0x0078 30   External interrupt 14   */
.word loop /*0x007C 31   External interrupt 15   */
.word loop /*0x0080 32   External interrupt 16   */
.word loop /*0x0084 33   External interrupt 17   */
.word loop /*0x0088 34   External interrupt 18   */
.word loop /*0x008C 35   External interrupt 19   */
.word loop /*0x0090 36   External interrupt 20   */
.word loop /*0x0094 37   External interrupt 21   */
.word loop /*0x0098 38   External interrupt 22   */
.word loop /*0x009C 39   External interrupt 23   */
.word loop /*0x00A0 40   External interrupt 24   */
.word timer32_handler /*0x00A4 41   External interrupt 25   */
.word loop /*0x00A8 42   External interrupt 26   */
.word loop /*0x00AC 43   External interrupt 27   */
.word loop /*0x00B0 44   External interrupt 28   */
.word loop /*0x00B4 45   External interrupt 29   */
.word loop /*0x00B8 46   External interrupt 30   */
.word loop /*0x00BC 47   External interrupt 31   */
.word loop /*0x00C0 48   External interrupt 32   */

reset:
    cpsid i
    ldr r0,stacktop
    mov sp,r0
    bl notmain
    b loop
.thumb_func
loop:   b .

.align
stacktop: .word 0x20008000

.thumb_func
.globl ienable
ienable:
    cpsie i
    bx lr

.thumb_func
.globl PUT8
PUT8:
    strb r1,[r0]
    bx lr

.thumb_func
.globl GET8
GET8:
    ldrb r0,[r0]
    bx lr

.thumb_func
.globl PUT16
PUT16:
    strh r1,[r0]
    bx lr

.thumb_func
.globl GET16
GET16:
    ldrh r0,[r0]
    bx lr

.thumb_func
.globl PUT32
PUT32:
    str r1,[r0]
    bx lr

.thumb_func
.globl GET32
GET32:
    ldr r0,[r0]
    bx lr

.thumb_func
.globl get_addr
get_addr:
    ldr r0,=timer32_handler
    bx lr

Your title question said assembly; I am using mixed C/asm to make it easier to read/use. You can certainly do yours all in asm if you like. Mine is not meant to be a library but a reference to see if you are doing the same things.

notmain.c

void PUT32 ( unsigned int, unsigned int );
unsigned int GET32 ( unsigned int );
void PUT8 ( unsigned int, unsigned int );
unsigned int GET8 ( unsigned int );
void PUT16 ( unsigned int, unsigned int );
unsigned int GET16 ( unsigned int );
void ienable ( void );
#define PORT_BASE       0x40004C00
#define PAOUT_L         (PORT_BASE+0x02)
#define PADIR_L         (PORT_BASE+0x04)
#define WDTCTL          0x4000480C
#define TIMER32_BASE    0x4000C000
#define ICSR            0xE000ED04
#define SCR             0xE000ED10
#define VTOR            0xE000ED08
#define NVIC_ISER0      0xE000E100
#define NVIC_IABR0      0xE000E300
#define NVIC_ICPR0      0xE000E280
volatile unsigned int ticks;
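void timer32_handler ( void )
{
    ticks^=1;
    PUT8(PAOUT_L,ticks);                /* toggle the led */
    PUT32(TIMER32_BASE+0x0C,0);         /* clear the interrupt at the peripheral */
    PUT32(NVIC_ICPR0,1<<25);            /* then clear the pending bit in the NVIC */
}
void notmain ( void )
{
    PUT16(WDTCTL,0x5A84);               /* disable the watchdog timer */
    PUT8(PADIR_L,GET8(PADIR_L)|0x01);   /* led pin as output */
    ticks=0;
    PUT32(VTOR,0x01000000);             /* vector table at the start of sram */
    PUT32(NVIC_ISER0,1<<25);            /* enable interrupt 25 in the NVIC */
    ienable();                          /* enable interrupts to the core */
    PUT32(TIMER32_BASE+0x08,0xA4);      /* set up the timer and enable its interrupt */
}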

sram.ld

MEMORY
{
    ram : ORIGIN = 0x01000000, LENGTH = 0x3000
}
SECTIONS
{
    .text : { *(.text*) } > ram
    .rodata : { *(.rodata*) } > ram
    .bss : { *(.bss*) } > ram
}

And that's it, 100% of the source code for this example. All you need to do is build it:

arm-none-eabi-as --warn sram.s -o sram.o
arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding  -mcpu=cortex-m4 -mthumb  -c notmain.c -o notmain.o
arm-none-eabi-ld      -T sram.ld sram.o notmain.o -o notmain.sram.elf
arm-none-eabi-objdump -D notmain.sram.elf > notmain.sram.list
arm-none-eabi-objcopy notmain.sram.elf notmain.sram.bin -O binary

Any of the gnu gcc/binutils cross compilers from the last decade or so should work, the arm-none-eabi style as well as the arm-whatever-linux style; this code isn't affected by the difference.

The architectural reference manual shows that the first entry in the vector table, at offset 0x0000, is the stack pointer initialization value; you can choose to use that or not. Then the exceptions start: exception 1 is reset, 2 is NMI, and so on. Exception 16 is where external (to the arm core) interrupt 0 lands, and they continue down the line, so interrupt 25 lands here

.word timer32_handler /*0x00A4 41 External interrupt 25 */

at offset 0xA4 in the vector table. If you are desperate or the chip isn't well documented, then either from the pending status or by simply spamming the vector table with all entries pointing at the handler you can narrow in on the offset/interrupt number. (Light an led or something when the interrupt comes, then go into an infinite loop: a horrible handler for real-world stuff, but just fine for reverse engineering a poorly documented part.)
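The arithmetic behind that offset, and behind the 1<<25 written to NVIC_ISER0, can be sketched in a few lines of C. This is an illustration only; the helper names are hypothetical, the NVIC_ISER0 address is the one from notmain.c, and the upper bound of 240 interrupts is an assumption about Cortex-M3/M4 class cores.

```c
#include <assert.h>
#include <stdint.h>

#define NVIC_ISER0 0xE000E100u  /* same address as in notmain.c */

/* Byte offset of an external interrupt's entry in the vector table. */
uint32_t irq_vector_offset(unsigned irq)
{
    assert(irq < 240u);         /* assumed Cortex-M3/M4 limit */
    return (16u + irq) * 4u;    /* 16 words precede external interrupt 0 */
}

/* Address of the NVIC_ISERn register holding this interrupt's enable bit. */
uint32_t irq_iser_address(unsigned irq)
{
    assert(irq < 240u);
    return NVIC_ISER0 + 4u * (irq / 32u);   /* 32 interrupts per register */
}

/* Bit mask for this interrupt inside its NVIC register. */
uint32_t irq_bit_mask(unsigned irq)
{
    return (uint32_t)1u << (irq % 32u);
}
```

For interrupt 25 these give offset 0xA4, register NVIC_ISER0 itself, and mask 1<<25, matching the values used in the example. The same register index and mask apply to NVIC_ICPRn and NVIC_IABRn, which use the same one-bit-per-interrupt layout.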

Before you execute anything, confirm you built things right. The entry point should be the code you expect; in this case, being sram, I have the entry point as instructions (that jump over what will become my vector table when I change VTOR):

Disassembly of section .text:

01000000 <_start>:
 1000000:   e060        b.n 10000c4 <reset>
 1000002:   46c0        nop         ; (mov r8, r8)
 1000004:   010000d1    ldrdeq  r0, [r0, -r1]
 1000008:   010000d1    ldrdeq  r0, [r0, -r1]
 100000c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000010:   010000d1    ldrdeq  r0, [r0, -r1]
 1000014:   010000d1    ldrdeq  r0, [r0, -r1]
 1000018:   010000d1    ldrdeq  r0, [r0, -r1]
 100001c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000020:   010000d1    ldrdeq  r0, [r0, -r1]
 1000024:   010000d1    ldrdeq  r0, [r0, -r1]
 1000028:   010000d1    ldrdeq  r0, [r0, -r1]
 100002c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000030:   010000d1    ldrdeq  r0, [r0, -r1]
 1000034:   010000d1    ldrdeq  r0, [r0, -r1]
 1000038:   010000d1    ldrdeq  r0, [r0, -r1]
 100003c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000040:   010000d1    ldrdeq  r0, [r0, -r1]
 1000044:   010000d1    ldrdeq  r0, [r0, -r1]
 1000048:   010000d1    ldrdeq  r0, [r0, -r1]
 100004c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000050:   010000d1    ldrdeq  r0, [r0, -r1]
 1000054:   010000d1    ldrdeq  r0, [r0, -r1]
 1000058:   010000d1    ldrdeq  r0, [r0, -r1]
 100005c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000060:   010000d1    ldrdeq  r0, [r0, -r1]
 1000064:   010000d1    ldrdeq  r0, [r0, -r1]
 1000068:   010000d1    ldrdeq  r0, [r0, -r1]
 100006c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000070:   010000d1    ldrdeq  r0, [r0, -r1]
 1000074:   010000d1    ldrdeq  r0, [r0, -r1]
 1000078:   010000d1    ldrdeq  r0, [r0, -r1]
 100007c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000080:   010000d1    ldrdeq  r0, [r0, -r1]
 1000084:   010000d1    ldrdeq  r0, [r0, -r1]
 1000088:   010000d1    ldrdeq  r0, [r0, -r1]
 100008c:   010000d1    ldrdeq  r0, [r0, -r1]
 1000090:   010000d1    ldrdeq  r0, [r0, -r1]
 1000094:   010000d1    ldrdeq  r0, [r0, -r1]
 1000098:   010000d1    ldrdeq  r0, [r0, -r1]
 100009c:   010000d1    ldrdeq  r0, [r0, -r1]
 10000a0:   010000d1    ldrdeq  r0, [r0, -r1]
 10000a4:   010000fd    strdeq  r0, [r0, -sp]
 10000a8:   010000d1    ldrdeq  r0, [r0, -r1]
 10000ac:   010000d1    ldrdeq  r0, [r0, -r1]
 10000b0:   010000d1    ldrdeq  r0, [r0, -r1]
 10000b4:   010000d1    ldrdeq  r0, [r0, -r1]
 10000b8:   010000d1    ldrdeq  r0, [r0, -r1]
 10000bc:   010000d1    ldrdeq  r0, [r0, -r1]
 10000c0:   010000d1    ldrdeq  r0, [r0, -r1]

010000c4 <reset>:
 10000c4:   b672        cpsid   i
 10000c6:   4803        ldr r0, [pc, #12]   ; (10000d4 <stacktop>)
 10000c8:   4685        mov sp, r0
 10000ca:   f000 f835   bl  1000138 <notmain>
 10000ce:   e7ff        b.n 10000d0 <loop>

010000d0 <loop>:
 10000d0:   e7fe        b.n 10000d0 <loop>

All the entries are the address of the handler ORRed with 1, as required.

In gnu assembler, notice that to get loop to work properly you need to precede the label with .thumb_func to tell the tool that the next label is a function (so set the lsbit when I ask for its address)

.thumb_func
loop:   b .

Without the .thumb_func there the address would be wrong and the handler would not get called. Another exception would happen instead, and if that handler's address is also wrong it is really game over.

If you want to build the table manually, understand that at the time this answer was written there is a pending bug report at gnu showing that ADR does not work right. It is a pseudo instruction and poorly documented in the architectural reference manual, so it is up to the assembler, which defines the assembly language (assembly is defined by the tool, not the target nor the architecture; the machine code is defined by the architecture, assembly language is a free-for-all). In the case of gnu assembler, the documentation claims that when interwork is set it will provide an address with the lsbit set so that a bx rd can be used, but that is false for forward-referenced labels. Other assemblers may define ADR however they wish, and you should check their definition. If you feel the need to use ADR, when in doubt ORR the lsbit (OR, don't ADD). I would certainly avoid the instruction altogether, for example:

.thumb_func
.globl get_addr
get_addr:
    ldr r0,=timer32_handler
    bx lr

010000f4 <get_addr>:
 10000f4:   4800        ldr r0, [pc, #0]    ; (10000f8 <get_addr+0x4>)
 10000f6:   4770        bx  lr
 10000f8:   010000fd    strdeq  r0, [r0, -sp]

which worked great. (Note this is disassembly; the strdeq is just the disassembler trying to make sense of the value 010000fd, which is what you should focus on.) The tools did the work for me, providing the address in the correct form that I needed. I am still relying on the tools and knowing/hoping they work, but using something that has worked and does work with at least gas/binutils.
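To make the "OR, don't add" advice concrete, here is a small host-side sketch. It is not from the original example and the function names are invented; the addresses are the ones visible in the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Force bit 0 on, marking the address as Thumb code. Safe to apply twice. */
uint32_t thumb_entry_or(uint32_t addr)
{
    return addr | 1u;
}

/* The tempting alternative: gives a wrong address if bit 0 is already set. */
uint32_t thumb_entry_add(uint32_t addr)
{
    return addr + 1u;
}

/* A vector table entry is only plausible if bit 0 is set. */
int is_thumb_entry(uint32_t entry)
{
    return (entry & 1u) != 0u;
}
```

If the tool already handed back 0x010000FD, OR leaves it alone, while ADD turns it into 0x010000FE, which has the lsbit clear and points at the wrong halfword.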

Notice that for safety my bootstrap starts by disabling interrupts. It then sets up the stack pointer and launches the C entry point. Since I have no .data nor require .bss to be zeroed, the linker script and bootstrap are that trivial. I have multiple reasons for abstracting read/write access; you can do it your way (be careful: the popular ways are not necessarily C compliant, so expect those habits/fads to fail some day).

For these parts (TI in general it seems) you want to disable the watchdog timer early on; otherwise its resetting the part will drive you crazy trying to figure out what is going on.

My board has an led on it, so I set that port pin to be an output.

I have a variable that I use to keep track of interrupts so I can blink the led on/off every interrupt.

Since I let the tools do the work I set the VTOR to the beginning of sram, which is a properly aligned address.
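For a table placed somewhere other than the start of a memory, a check along these lines may help. It is a sketch that assumes the rule given in ARM's Cortex-M3/M4 generic user guides: a minimum alignment of 32 words, rounded up to the next power of two that covers the 16 system entries plus the external interrupts. The function names are hypothetical, and the number of interrupts your part implements has to come from its datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Required table alignment in bytes for num_irqs external interrupts. */
uint32_t vtor_alignment(unsigned num_irqs)
{
    uint32_t words = 16u + num_irqs;   /* system entries plus external ones */
    uint32_t align = 32u;              /* assumed minimum: 32 words */
    assert(num_irqs <= 240u);          /* assumed Cortex-M3/M4 limit */
    while (align < words)
        align *= 2u;                   /* round up to a power of two */
    return align * 4u;                 /* words to bytes */
}

/* Nonzero if base is a usable VTOR value for that many interrupts. */
int vtor_is_aligned(uint32_t base, unsigned num_irqs)
{
    return (base & (vtor_alignment(num_irqs) - 1u)) == 0u;
}
```

With the 33 external entries in the table above that works out to 256 bytes, so 0x01000000 passes, while something like 0x01000080 would not.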

I enable the interrupt in the NVIC

I enable interrupts to the core

I set up the peripheral and enable its interrupts.
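The 0xA4 written to TIMER32_BASE+0x08 can be spelled out bit by bit. The sketch below assumes the control register follows the ARM SP804-style layout (enable in bit 7, mode in bit 6, interrupt enable in bit 5, prescale in bits 3:2, size in bit 1, one-shot in bit 0); the macro and function names are made up, so confirm the fields against the MCU's technical reference manual before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SP804-style control bits; verify in the technical reference manual. */
#define T32_ENABLE      (1u << 7)         /* timer running */
#define T32_PERIODIC    (1u << 6)         /* 0 = free-running, 1 = periodic */
#define T32_IE          (1u << 5)         /* interrupt enable (the mask) */
#define T32_PRESCALE(n) (((n) & 3u) << 2) /* assumed: 0 = /1, 1 = /16, 2 = /256 */
#define T32_SIZE32      (1u << 1)         /* 0 = 16-bit, 1 = 32-bit counter */
#define T32_ONESHOT     (1u << 0)         /* 0 = wrapping, 1 = one-shot */

/* Control value used in notmain(): running, interrupt enabled, prescale 1. */
uint32_t t32_control_with_irq(void)
{
    return T32_ENABLE | T32_IE | T32_PRESCALE(1u);
}

/* Same timer setup with the interrupt masked, for the polling experiments. */
uint32_t t32_control_polling(void)
{
    return T32_ENABLE | T32_PRESCALE(1u);
}
```

Under that assumed layout the first value is 0xA4 and the second is 0x84, which is the kind of setting to use while learning the timer by polling the RIS register.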

Since I wrote the bootstrap and know it simply lands in an infinite loop when the C entry point function returns, I can just return and leave the processor in that infinite loop, waiting for interrupts, with the interrupt handler doing the rest of the work.

In the handler I clear the interrupt (after toggling the led), working from the peripheral toward the core. YMMV if you do it the other way.

That's it. It sounds like you are doing these steps, but since you have not provided the information required to see what you are really doing, I can only guess which step is missing, has the wrong value, or is in the wrong place.

I can't emphasize enough that, where possible, in any chip/processor you should use polling as much as you can, experimenting with targeted tests to figure out the peripheral and follow the interrupt through however many layers of interrupt gates there are, and only enabling interrupts into the core after you have mastered as much as possible without actually causing the processor to interrupt. Doing it all at once makes the development take many times longer on average and is often significantly more painful.

My hope is that this long answer triggers a simple three-second fix to your code; if not, you can at least try to develop from it a test for your chip. I have not posted the uart-enabled version I used to discover how this part worked, but using that path it was pretty easy to figure the peripheral out, then walk the interrupt toward the core, have everything ready to create and clear interrupts, and lastly enable the interrupt into the core. It worked the first time (a bit of luck there, it doesn't always happen that way).

EDIT

But if I do not relocate the vector table into SRAM, how do I route the corresponding interrupt to its handler?

You simply add the label to the vector table:

    .thumb
    .thumb_func
    .global _start
    _start:
    stacktop: .word 0x20001000
    .word reset
    .word hello
    .word world
    
    .thumb_func
    reset: b .
    
    .thumb_func
    hello: b .
    
    .thumb_func
    world: b .


arm-none-eabi-as flash.s -o flash.o
arm-none-eabi-ld -Ttext=0 flash.o -o flash.elf
arm-none-eabi-objdump -D flash.elf 

flash.elf:     file format elf32-littlearm


Disassembly of section .text:

00000000 <_start>:
   0:   20001000    andcs   r1, r0, r0
   4:   00000011    andeq   r0, r0, r1, lsl r0
   8:   00000013    andeq   r0, r0, r3, lsl r0
   c:   00000015    andeq   r0, r0, r5, lsl r0

00000010 <reset>:
  10:   e7fe        b.n 10 <reset>

00000012 <hello>:
  12:   e7fe        b.n 12 <hello>

00000014 <world>:
  14:   e7fe        b.n 14 <world>

No need to copy and modify the vector table; everything is in place in flash.

I have to wonder why you don't know what your handlers are at build time and have to add things at runtime; this is an MCU. Maybe you have a generic bootloader? But in that case you wouldn't need to preserve any of the prior handlers. If you must move the table to sram and add an entry at runtime, that is fine, but you have to ensure that 1) VTOR is supported by the core and by the implementation of that core you are using, and 2) your entry is correct per the rules for this architecture.

Get either of those wrong and it won't work. Then of course there is the peripheral setup, enabling the interrupt through the gates to the core, enabling it through the core to the processor, and clearing the interrupt in the handler so it doesn't fire infinitely.
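If the table really does have to be moved and patched at runtime, the copy-and-patch step might look roughly like the following. This is a sketch under stated assumptions, not the asker's code or a vendor API: the function name and parameters are invented, the tables are passed in as plain arrays so the logic can be tried on a host, and the interrupt number for a given peripheral has to come from that chip's datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYSTEM_ENTRIES 16u  /* initial SP plus exceptions 1..15 */

/*
 * Copy 'entries' words from the old table to the new one, then point
 * external interrupt 'irq' at 'handler'. Returns 0 on success, or -1
 * (touching nothing) if the interrupt does not fit in the table.
 */
int relocate_and_patch(const uint32_t *old_table, uint32_t *new_table,
                       size_t entries, unsigned irq, uint32_t handler)
{
    size_t index = SYSTEM_ENTRIES + irq;
    size_t i;

    assert(old_table != NULL && new_table != NULL);
    if (index >= entries)
        return -1;
    for (i = 0; i < entries; i++)
        new_table[i] = old_table[i];
    new_table[index] = handler | 1u;   /* OR, not add: bit 0 marks Thumb */
    return 0;
}
```

On the target, new_table would have to sit at a properly aligned sram address, and its address would then be written to VTOR before the interrupt is enabled in the NVIC, in the same order as notmain() above.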
