3
votes

I'm working in C with arm-none-eabi-gcc and have been having an issue with pointers: the writes through them don't seem to exist in the generated code. Perhaps I'm passing the wrong options to the compiler.

Here is an example.

    unsigned int * gpuPointer = GetGPU_Pointer(framebufferAddress);
    unsigned int color = 16;
    int y = 768;
    int x = 1024;

    while(y >= 0)
    {
        while(x >= 0)
        {
            *gpuPointer = color;
            color = color + 2;
            x--;
        }

        color++;
        y--;
        x = 1024;
    }

and the output from the disassembler.

    81c8:   ebffffc3    bl  80dc <GetGPU_Pointer>
    81cc:   e3a0c010    mov ip, #16 ; 0x10
    81d0:   e28c3b02    add r3, ip, #2048   ; 0x800
    81d4:   e2833002    add r3, r3, #2  ; 0x2
    81d8:   e1a03803    lsl r3, r3, #16
    81dc:   e1a01823    lsr r1, r3, #16
    81e0:   e1a0300c    mov r3, ip
    81e4:   e1a02003    mov r2, r3
    81e8:   e2833002    add r3, r3, #2  ; 0x2
    81ec:   e1a03803    lsl r3, r3, #16
    81f0:   e1a03823    lsr r3, r3, #16
    81f4:   e1530001    cmp r3, r1
    81f8:   1afffff9    bne 81e4 <setup_framebuffer+0x5c>

Shouldn't there be a str instruction around 81e4? In addition, GetGPU_Pointer comes from an assembly file, but it is declared like so:

    extern unsigned int * GetGPU_Pointer(unsigned int framebufferAddress);

My gut feeling is that it's something absurdly simple, but I'm missing it.

2 Answers

7
votes

You never change the value of gpuPointer, and you haven't declared it as a pointer to volatile. So from the compiler's perspective you are overwriting a single memory location (*gpuPointer) 769*1025 times (both loops count down to 0 inclusive), but since you never read back the value you are writing, the compiler is entitled to optimize by doing a single write at the end of the loop.

4
votes

Adding to rici's answer (upvote rici not me)...

It gets even better. Taking what you offered and wrapping it in a function:

extern unsigned int * GetGPU_Pointer ( unsigned int );
void fun ( unsigned int framebufferAddress )
{
    unsigned int * gpuPointer = GetGPU_Pointer(framebufferAddress);
    unsigned int color = 16;
    int y = 768;
    int x = 1024;

    while(y >= 0)
    {
        while(x >= 0)
        {
            *gpuPointer = color;
            color = color + 2;
            x--;
        }

        color++;
        y--;
        x = 1024;
    }

}

Optimizes to

00000000 <fun>:
   0:   e92d4008    push    {r3, lr}
   4:   ebfffffe    bl  0 <GetGPU_Pointer>
   8:   e59f3008    ldr r3, [pc, #8]    ; 18 <fun+0x18>
   c:   e5803000    str r3, [r0]
  10:   e8bd4008    pop {r3, lr}
  14:   e12fff1e    bx  lr
  18:   00181110    andseq  r1, r8, r0, lsl r1

because the code really doesn't do anything but that one store.
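As a sanity check on that constant, the loop can be replayed on a host machine. This is only a sketch: the function name `replay_last_store` and the `store_count` parameter are made up for illustration, and the store through `gpuPointer` is replaced by remembering the last value that would have been written.

```c
#include <assert.h>

/* Host-side replay of the loop from the question (illustrative names).
 * Instead of storing through a pointer, it records the last value that
 * would have been stored and counts how many stores the source asks for. */
unsigned int replay_last_store(unsigned long *store_count)
{
    unsigned int color = 16;
    unsigned int last = 0;
    int y = 768;
    int x = 1024;

    assert(store_count != 0);
    *store_count = 0;

    while (y >= 0)
    {
        while (x >= 0)
        {
            last = color;          /* stands in for *gpuPointer = color; */
            (*store_count)++;
            color = color + 2;
            x--;
        }

        color++;
        y--;
        x = 1024;
    }

    return last;
}
```

The value returned is 0x00181110, the same word the compiler placed in the literal pool at offset 18, and the store count is 769 * 1025. All but the last of those stores are dead as far as the compiler can tell, which is why a single str survives.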

Now if you were to modify the pointer

while(x >= 0)
{
    *gpuPointer = color;
    gpuPointer++;
    color = color + 2;
    x--;
}

then you get the store you were looking for

00000000 <fun>:
   0:   e92d4010    push    {r4, lr}
   4:   ebfffffe    bl  0 <GetGPU_Pointer>
   8:   e59f403c    ldr r4, [pc, #60]   ; 4c <fun+0x4c>
   c:   e1a02000    mov r2, r0
  10:   e3a0c010    mov ip, #16
  14:   e2820a01    add r0, r2, #4096   ; 0x1000
  18:   e2801004    add r1, r0, #4
  1c:   e1a0300c    mov r3, ip
  20:   e4823004    str r3, [r2], #4
  24:   e1520001    cmp r2, r1
  28:   e2833002    add r3, r3, #2
  2c:   1afffffb    bne 20 <fun+0x20>
  30:   e28ccb02    add ip, ip, #2048   ; 0x800
  34:   e28cc003    add ip, ip, #3
  38:   e15c0004    cmp ip, r4
  3c:   e2802004    add r2, r0, #4
  40:   1afffff3    bne 14 <fun+0x14>
  44:   e8bd4010    pop {r4, lr}
  48:   e12fff1e    bx  lr
  4c:   00181113    andseq  r1, r8, r3, lsl r1

or if you make it volatile (and then you don't have to modify it)

volatile unsigned int * gpuPointer = GetGPU_Pointer(framebufferAddress);

then

00000000 <fun>:
   0:   e92d4008    push    {r3, lr}
   4:   ebfffffe    bl  0 <GetGPU_Pointer>
   8:   e59fc02c    ldr ip, [pc, #44]   ; 3c <fun+0x3c>
   c:   e3a03010    mov r3, #16
  10:   e2831b02    add r1, r3, #2048   ; 0x800
  14:   e2812002    add r2, r1, #2
  18:   e5803000    str r3, [r0]
  1c:   e2833002    add r3, r3, #2
  20:   e1530002    cmp r3, r2
  24:   1afffffb    bne 18 <fun+0x18>
  28:   e2813003    add r3, r1, #3
  2c:   e153000c    cmp r3, ip
  30:   1afffff6    bne 10 <fun+0x10>
  34:   e8bd4008    pop {r3, lr}
  38:   e12fff1e    bx  lr
  3c:   00181113    andseq  r1, r8, r3, lsl r1

and you get your store. Built and disassembled with:

arm-none-eabi-gcc -O2 -c a.c -o a.o
arm-none-eabi-objdump -D a.o
arm-none-eabi-gcc (GCC) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The problem is that, as written, you didn't tell the compiler to update the pointer, so every store goes to the same location. As in my first example, it has no reason to even implement the loop; it can pre-compute the final value and write it once. To force the compiler to implement the loop and perform every store, you need to make the pointed-to type volatile, modify the pointer, or both, depending on what you really need to do.
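Putting both fixes together, a fill routine might look like the sketch below. It is not the asker's code: the name `fill_gradient` and the `width` and `height` parameters are hypothetical, and it assumes 32-bit pixels laid out contiguously with no row padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes one value per pixel and advances the pointer.
 * The pointed-to type is volatile, so every store must be emitted.
 * width/height and the packed 32-bit layout are assumptions. */
void fill_gradient(volatile unsigned int *fb, size_t width, size_t height)
{
    unsigned int color = 16;
    size_t x;
    size_t y;

    assert(fb != NULL);

    for (y = 0; y < height; y++)
    {
        for (x = 0; x < width; x++)
        {
            *fb++ = color;   /* store, then step to the next pixel */
            color += 2;
        }

        color++;
    }
}
```

On the target this would be called as something like `fill_gradient(GetGPU_Pointer(framebufferAddress), 1024, 768);`. Note that the original `>= 0` loop bounds run 1025 by 769 iterations; if the buffer really is 1024 by 768, the strict bounds used here avoid writing one column and one row too many once the pointer is being advanced.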