3
votes

So I am using bare-metal malloc in combination with a self-written _sbrk. I am running everything on the Stellaris Launchpad. This board contains a Cortex-M4. It also contains 32K of RAM, starting at address 0x20000000 and running until 0x20007FFF. At the very start of the program, the situation is as follows:

  • _ebss is the start of the assignable memory; everything in RAM with a lower address than _ebss is statically allocated (.data and .bss). It is defined in the linker script. The value at start is 0x200008b0
  • At the loading phase of the software the main stack pointer is set to 0x20007FFF, and the stack grows down (toward lower addresses). This means that, at the very start, there is about 30.5K of RAM free for assignment.

_sbrk wants to hand out this RAM and acknowledges the correct amount of free RAM: after the initialization phase (which means that the stack has grown a bit, so the amount of free RAM is lower), I get the following memory printouts:

(S) MSP: 0x20007f4c, heap_end 0x200008b0, diff 30364
(F) MSP: 0x20007f4c, heap_end 0x200008e8, diff 30308
(S) MSP: 0x20007f4c, heap_end 0x200008e8, diff 30308
(F) MSP: 0x20007f4c, heap_end 0x20000fc8, diff 28548
arena: 1816
ordblks: 1
smblks: 0
hblks: 0
hblkhd 0
usmblks: 0
fsmblks: 0
uordblks: -1056
fordblks: 2872
keepcost: 2872

The first four lines are from _sbrk, the remaining lines from mallinfo(). (S) means at the start of _sbrk (before the heap pointer is moved) and (F) means after the heap pointer has been moved. If I read the documentation of mallinfo correctly (this is the tricky part, I'm afraid), then arena means the amount of heap requested from _sbrk. This would make sense, because the difference between the very first printed heap_end and the very last printed heap_end is indeed 1816. And fordblks should mean the total amount of heap free, so the maximum amount of heap that can still be requested from _sbrk. And that is incorrect. As you can see from the last diff, the difference between heap_end and MSP is 28K. Of course we want to keep a buffer between the two so that the stack can grow without corrupting everything, but we want to give out more than 2.8K of RAM as heap.
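As a sanity check, the arithmetic on this printout can be written out. This is a minimal sketch in C that uses only numbers from the log; the helper names are made up for illustration. In this printout uordblks + fordblks equals arena, which hints that fordblks counts free bytes inside the arena rather than RAM that _sbrk could still hand out. That reading is an assumption, not something confirmed from the newlib source.

```c
#include <stdint.h>

/* Bytes between two heap_end values printed by _sbrk. */
static long heap_growth(uint32_t first_heap_end, uint32_t last_heap_end)
{
    return (long)(last_heap_end - first_heap_end);
}

/* Nonzero when the used and free byte counts add up to the arena size. */
static int mallinfo_adds_up(long arena, long uordblks, long fordblks)
{
    return uordblks + fordblks == arena;
}
```

With the values above, heap_growth(0x200008b0, 0x20000fc8) gives 1816, and mallinfo_adds_up(1816, -1056, 2872) holds.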

This becomes a problem when I let the program run a bit longer: eventually fordblks approaches zero and malloc starts to return NULL, while heap_end has not reached the MSP by a long shot. So malloc is refusing to give out more memory way before it needs to. How do I fix this behaviour? What is the fordblks value based on?

(S) MSP: 0x20007f34, heap_end 0x20000fc8, diff 28524
(F) MSP: 0x20007f34, heap_end 0x20001fc8, diff 24428
(S) MSP: 0x20007f34, heap_end 0x20001fc8, diff 24428
(F) MSP: 0x20007f34, heap_end 0x20002000, diff 24372
Could not create process timeoutProc30s
Test prepared
arena: 8304
ordblks: 2
smblks: 0
hblks: 0
hblkhd 0
usmblks: 0
fsmblks: 0
uordblks: 7648
fordblks: 656
keepcost: 112

Edit: More info! Above you see the actual moment where malloc says no: it's at the "Could not create process.." line. As you can clearly see, just above that malloc made its last attempt at acquiring heap from _sbrk, and _sbrk obliged (the heap pointer changed). There is also 24K of empty RAM left. I tried to malloc 1024 bytes of RAM when malloc returned NULL.

Edit: The linker_script.ld file:

_stack_buffer = 128; /*128 byte buffer between HEAP and STACK*/

MEMORY
{
    FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 0x00040000
    SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

SECTIONS
{
    .text :
    {
        _text = .;
        KEEP(*(.isr_vector))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > FLASH

    .data : /*AT(ADDR(.text) + SIZEOF(.text))*/
    {
        _data = .;
        *(vtable)
        *(.data*)
        _edata = .;
    } > SRAM AT > FLASH

    .bss : AT (ADDR(.data) + SIZEOF(.data))
    {
        _bss = .;
        *(.bss*)
        *(COMMON)
        _ebss = .;
        _end = .;
    } > SRAM

    _stack_top = ORIGIN(SRAM) + LENGTH(SRAM) - 1; /*The starting point of the stack, at the very top (highest address) of the RAM*/
}
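
For reference, the kind of check an _sbrk needs for this layout can be modelled on a host machine. This is a sketch only, not my actual _sbrk: heap_model and model_sbrk are hypothetical names, the stack pointer is passed in as a plain number instead of being read from MSP, and on the target heap_end would start at &_ebss with the gap set from _stack_buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the heap/stack layout. */
typedef struct {
    uintptr_t heap_end;      /* current program break */
    uintptr_t stack_buffer;  /* gap to keep free below the stack pointer */
} heap_model;

/* Returns the previous break on success, or (uintptr_t)-1 when growing
 * by incr would come within stack_buffer bytes of sp. */
static uintptr_t model_sbrk(heap_model *h, uintptr_t sp, ptrdiff_t incr)
{
    uintptr_t prev = h->heap_end;
    uintptr_t limit = sp - h->stack_buffer;

    if (incr > 0 && (prev > limit || (uintptr_t)incr > limit - prev))
        return (uintptr_t)-1;

    /* Unsigned wrap-around makes a negative incr shrink the heap. */
    h->heap_end = prev + (uintptr_t)incr;
    return prev;
}
```

Starting from heap_end 0x200008b0 with sp 0x20007f4c, requests of 56 and then 1760 bytes move heap_end to 0x200008e8 and 0x20000fc8, matching the first printout.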

Edit: I did some more research on the subject and I found some interesting stuff. First of all: fordblks is unrelated to the actual empty heap and I do not know what it is based on. If you malloc a lot of 100-byte blocks in a while(true) loop, the mallocing will continue until _sbrk returns -1, which is expected behaviour. But there are certain circumstances where malloc will return NULL without the heap actually being full. One example is where a malloc(1024) returns NULL, but five malloc(555) calls succeed. So it seems to be related to the internals of malloc somehow.

Disclaimer: last time I asked a bare-metal question I was scoffed at for being arrogant. I am not saying that newlib malloc is doing anything wrong, and I suspect I need to define something in the linker script or similar to fix this. I am aware that all of this is my fault, and I am here to ask why this problem occurs and how I need to change my code to fix this behaviour, not to say that the guys at newlib have no clue what they are doing.

1
I don't know newlib, but memory manager blocks are not normally 1 byte as you seem to assume. It's also possible that malloc is using space for internal bookkeeping. You'd have to look at the source to tell. - Gene
But the malloc documentation says: int fordblks; /* Total free space (bytes) */, so I assume it is still about bytes. And even with bookkeeping (I would assume it is included in the arena var), the space is not requested from _sbrk. - Cheiron
Wouldn't it be easier to analyze if the printing printed hex instead of decimal? I grant you that 536903460 maps to 0x20007F24 which is in the range you stated, but it would be a whole heap easier to see if the printout was in hex. Also, if the stack starts at 0x20007FFF, surely the stack grows downwards towards 0x20000000, not upwards? - Jonathan Leffler
You are right, I will change the question. And indeed the stack grows towards 0x200*, and I find it hard to remember if that's up or down (it grows from a higher number to a lower number, so to me it feels like down). I will also update that. - Cheiron
Can you include your linker script in the question? - user149341

1 Answer

2
votes

Cheiron,

  1. Usually, if a system has little memory or is under high load, dynamic allocation is avoided as much as possible. I'd recommend looking at techniques like memory pools.
  2. Memory managers always suffer from fragmentation on long-running systems. Blocks have alignment and minimum-size requirements; for example, I have seen memory managers that allocate at least 48 bytes for any request. And no general-purpose memory manager can account for your specific problem domain and the memory usage pattern that follows from it.

So my recommendation is to avoid dynamic memory allocation and use pools of objects specific to your requirements. This is almost always the case in embedded systems, where almost everything uses 'pools', 'circular buffers' and so on. I hope this advice helps in some way.
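
To make the pool idea concrete, here is a minimal sketch of a fixed-size block pool in C. The block size, block count and all names are placeholders chosen for illustration; a real pool would be sized for the objects your application actually creates.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  64  /* hypothetical: bytes per block */
#define POOL_BLOCK_COUNT 8   /* hypothetical: number of blocks */

/* A block is either on the free list (next is valid) or handed out
 * (bytes is the payload). Alignment is that of a pointer; add a wider
 * member if the stored objects need more. */
typedef union pool_block {
    union pool_block *next;
    unsigned char bytes[POOL_BLOCK_SIZE];
} pool_block;

static pool_block pool_storage[POOL_BLOCK_COUNT]; /* static, no heap */
static pool_block *pool_free_list;

/* Put every block on the free list. Call once at startup. */
static void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Take one block, or return NULL when the pool is exhausted. */
static void *pool_alloc(void)
{
    pool_block *b = pool_free_list;
    if (b != NULL)
        pool_free_list = b->next;
    return b;
}

/* Return a block previously obtained from pool_alloc. */
static void pool_free(void *p)
{
    pool_block *b = p;
    assert(b >= pool_storage && b < pool_storage + POOL_BLOCK_COUNT);
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Because every block has the same size, allocation and release are constant time and the pool cannot fragment; the cost is that the worst-case number of objects must be decided up front.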