A bit of introduction to the architecture

First, a bit about the target architecture, in case anyone finds it useful. The target is an STM32H7 series microcontroller: an ARM Cortex-M7 with many different memory regions located on different buses around the chip, in particular:

  • ITCM memory - 64 KB RAM as fast as the core, used for storing instructions, not accessible by normal DMA
  • DTCM memory - 128 KB RAM as fast as the core, used for storing data, not accessible by normal DMA
  • AXI SRAM - 512 KB RAM on a 64-bit bus (AXI) that runs at half the CPU clock (but because it's 64-bit it is as fast as the core for sequential reads)
  • SRAM 1/2/3 - RAM memories located on AHB bus, that connects through AXI bus - slower to access by CPU, but on the same AHB bus matrix are DMA controllers and most of the peripherals used by the application
  • SRAM4 and Backup RAM - other memories located in another part of chip on different bus matrix, not important here

A special kind of DMA - MDMA - can transfer data between the I/DTCM memories and the other RAMs in the system.

Example real-world problem

As an example, suppose that I want to create a C++ class that deals with sending data over a serial port. The requirements are:

  • Data to UART is sent using DMA - the application can't spend cycles on using interrupts here.
  • The queue for data to be sent out has to be located in DTCM memory, so if I write serial_write(some_data_to_send) it is going to be copied to the queue without wait states (if some_data_to_send is located in DTCM as well)
  • Buffer for DMA should be located in one of SRAM 1/2/3 (so DMA transactions don't use AXI bus), cannot be located in I/DTCM memories
  • Data from queue to DMA buffer should be copied using MDMA

That means that the class needs at least:

  • Buffer that is guaranteed to be in DTCM ram
  • Buffer that is guaranteed to be not in DTCM ram

That means that it cannot be part of the same object, i.e. I can't write something like:

class serial_handler
{
  __attribute__((section(".dtcm"))) std::array<std::uint8_t, 128> m_send_queue;
  __attribute__((section(".sram1"))) std::array<std::uint8_t, 128> m_dma_buffer;
};

Note: It is also not possible to make these variables static - this class might exist in multiple instances handling multiple serial ports.

Thus I need to pass the buffer from the outside, i.e.

class serial_handler
{
public:
  using buffer_type = std::array<std::uint8_t, 128>;
private:
  buffer_type &m_send_queue;
  buffer_type &m_dma_buffer;
public:
  /**
   * @param send_queue buffer located in DTCM ram
   * @param dma_buffer buffer located in SRAM 1,2 or 3
   */
  serial_handler(buffer_type &send_queue, buffer_type &dma_buffer);
};

// ...
static __attribute__((section(".dtcm"))) serial_handler::buffer_type s_serial_send_queue;
static __attribute__((section(".sram1"))) serial_handler::buffer_type s_serial_dma_buffer;
static serial_handler serial{s_serial_send_queue, s_serial_dma_buffer};

But now information about location of buffers is encoded in a comment. The serial_handler could (and probably should) assert that passed buffers are in correct address spaces, but that is a runtime check and I would like a compile time error.
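For reference, a minimal sketch of the kind of runtime check meant here. The region bounds are assumptions based on an STM32H743-style memory map (DTCM at 0x20000000, SRAM1 at 0x30000000, 128 KB each); in a real project they would more likely come from linker-script symbols. The names mem_region, in_region and object_in_region are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed address ranges (STM32H743-style memory map); not taken from a
// real linker script.
struct mem_region {
    std::uintptr_t base;
    std::size_t    size;
};
constexpr mem_region k_dtcm {0x20000000u, 128u * 1024u};
constexpr mem_region k_sram1{0x30000000u, 128u * 1024u};

// True if the byte range [addr, addr + len) lies entirely inside region r.
constexpr bool in_region(std::uintptr_t addr, std::size_t len, mem_region r)
{
    return addr >= r.base && len <= r.size && (addr - r.base) <= (r.size - len);
}

// Convenience wrapper for an actual object, e.g. a buffer passed to a constructor.
template <class T>
bool object_in_region(const T &obj, mem_region r)
{
    return in_region(reinterpret_cast<std::uintptr_t>(&obj), sizeof(T), r);
}
```

The serial_handler constructor would then contain something like assert(object_in_region(send_queue, k_dtcm)) and assert(object_in_region(dma_buffer, k_sram1)). This only fires when the code actually runs on the target, which is exactly the limitation described above.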

Another solution would be to create a memory pool for each section and allocate memory from required region, but again - that is a runtime solution and also that doesn't permit static memory usage analysis.

Abstract problem description

Suppose that the linker has two additional sections - .dtcm and .sram1. I want to create a class with a constructor that takes two statically allocated buffers (pointers, references, other temporary objects holding reference that will be optimized away) - one from each section:

struct example
{
  example(IN_DTCM buffer_type_ref buffer_1, IN_SRAM1 buffer_type_ref buffer_2);
};

I want it done in such a way that it emits a compile-time error if a provided variable is not in the required section. Alternatively, the constructor can allocate memory from other objects, but it has to be done in a way that is statically analyzable at link time.

This object will only ever be constructed in a static context (i.e. either a global or static variable in class/function).

EDIT: Clarifying after some comments - the ideal solution would be to create types dtcm_buffer_type and sram1_buffer_type that would always be placed in the appropriate sections (i.e. they could only exist in a static context, never on the stack, which could be in any memory region). I don't believe that such a solution is possible, which is why I used the IN_DTCM placeholder in the example above.
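To make the placeholder idea more concrete, here is a rough sketch that approximates it with tag types and a declaring macro. It is not a real solution: the placement guarantee lives entirely in the macro, and nothing stops someone from calling the factory function directly with a buffer from the wrong region. All names (PLACE_IN, placed_ref, dtcm_tag, sram1_tag, DTCM_BUFFER, SRAM1_BUFFER) are hypothetical, and the section attribute is compiled out on non-ARM hosts so the sketch also builds there.

```cpp
#include <array>
#include <cstdint>
#include <type_traits>

// Assumption: only apply the section attribute when building for the ARM target.
#if defined(__arm__)
#define PLACE_IN(sec) __attribute__((section(sec)))
#else
#define PLACE_IN(sec)
#endif

using buffer_type = std::array<std::uint8_t, 128>;

struct dtcm_tag {};
struct sram1_tag {};

// A reference to a buffer, tagged with the region it is supposed to live in.
template <class Tag>
class placed_ref {
public:
    // Intended to be called only from the declaring macros below.
    static constexpr placed_ref unsafe_make(buffer_type &b) { return placed_ref{&b}; }
    buffer_type &get() const { return *m_buf; }
private:
    constexpr explicit placed_ref(buffer_type *b) : m_buf(b) {}
    buffer_type *m_buf;
};

// Each macro defines the storage in its section plus a matching tagged reference.
#define DTCM_BUFFER(name) \
    static PLACE_IN(".dtcm") buffer_type name##_storage; \
    static constexpr placed_ref<dtcm_tag> name = \
        placed_ref<dtcm_tag>::unsafe_make(name##_storage)

#define SRAM1_BUFFER(name) \
    static PLACE_IN(".sram1") buffer_type name##_storage; \
    static constexpr placed_ref<sram1_tag> name = \
        placed_ref<sram1_tag>::unsafe_make(name##_storage)

class serial_handler {
public:
    // Swapping the two arguments is a compile-time error: the tags differ.
    serial_handler(placed_ref<dtcm_tag> send_queue, placed_ref<sram1_tag> dma_buffer)
        : m_send_queue(send_queue.get()), m_dma_buffer(dma_buffer.get()) {}
    buffer_type &send_queue() { return m_send_queue; }
    buffer_type &dma_buffer() { return m_dma_buffer; }
private:
    buffer_type &m_send_queue;
    buffer_type &m_dma_buffer;
};

DTCM_BUFFER(s_serial_send_queue);
SRAM1_BUFFER(s_serial_dma_buffer);
static serial_handler serial{s_serial_send_queue, s_serial_dma_buffer};
```

Because both the storage and the tagged reference are ordinary static objects, memory usage per section should still be visible in the map file.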

Final comments

This problem has been taunting me for well over a year now, and I haven't found anything that comes even close to meeting the requirements above. I was able to create a static allocator using templates (i.e. an allocator that lives in a specified memory section but is able to report its usage at link time), but that is simply not feasible to use in a project over 100,000 lines long.

My current solution is passing buffers from outside and using runtime asserts on the memory addresses of the provided buffers - but that is far from ideal, especially if one forgets to add the assert. That often leads to very hard to diagnose bugs (for example a performance problem in a completely unrelated part of the code due to bus matrix saturation - been there, done that).

The solution can be gcc/g++ specific, but it would be nice if it worked in clang as well.

Why not just have different types for objects that serve different purposes? - n. 1.8e9-where's-my-share m.
Because that doesn't solve the problem of required memory placement. - Enbyted
You need to create a type that enforces memory placement. - n. 1.8e9-where's-my-share m.
And how do I go about that? That's the point of this question - it's not like I can put section attribute on type - that wouldn't make sense (consider instantiating the type on the stack - this should not be possible then) - Enbyted
I edited the question to clarify that this would be preferred solution - if possible to execute. - Enbyted

1 Answer

Here is a semi-working implementation (tested with gcc and clang using C++17). The idea is to force each buffer to have its own type. This implementation requires an additional level of indirection, and unfortunately it has to rely on macros.

using buffer_type = std::array<std::uint8_t, 128>;
struct buffer {
    virtual buffer_type& send_queue() = 0;
    virtual buffer_type& dma_buffer() = 0;
};

#define DECLARE_BUFFER_TYPE(name) \
struct buffer_impl_ ## name : buffer { \
    static inline __attribute__((section(".dtcm")))  buffer_type s_serial_send_queue; \
    static inline __attribute__((section(".sram1"))) buffer_type s_serial_dma_buffer; \
    buffer_type& send_queue() override { return s_serial_send_queue; } \
    buffer_type& dma_buffer() override { return s_serial_dma_buffer; } \
};

#define DECLARE_BUFFER(name) DECLARE_BUFFER_TYPE(name); buffer_impl_ ## name name;

Declare a buffer as follows:

DECLARE_BUFFER(mybuf1);

Each buffer declared this way has its own class and so its own set of static variables. The library uses buffer& as the parameter type.

serial_handler(buffer& b);

It is very hard to pass a wrong type accidentally, though of course it is easy to do so intentionally.
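A small end-to-end sketch of how this could be wired up for two serial ports. The macros are repeated from above so the example is self-contained; the serial_handler body, its write() method, the PLACE_IN wrapper and the uart1/uart2 names are hypothetical stand-ins, and the section attribute is assumed to be compiled out on non-ARM hosts so the example can be built and run there too.

```cpp
#include <array>
#include <cstdint>

// Assumption: only apply the section attribute when building for the ARM target.
#if defined(__arm__)
#define PLACE_IN(sec) __attribute__((section(sec)))
#else
#define PLACE_IN(sec)
#endif

using buffer_type = std::array<std::uint8_t, 128>;

struct buffer {
    virtual buffer_type& send_queue() = 0;
    virtual buffer_type& dma_buffer() = 0;
};

// One class per declared buffer, so each gets its own pair of static arrays.
#define DECLARE_BUFFER_TYPE(name) \
struct buffer_impl_ ## name : buffer { \
    static inline PLACE_IN(".dtcm")  buffer_type s_serial_send_queue; \
    static inline PLACE_IN(".sram1") buffer_type s_serial_dma_buffer; \
    buffer_type& send_queue() override { return s_serial_send_queue; } \
    buffer_type& dma_buffer() override { return s_serial_dma_buffer; } \
}

#define DECLARE_BUFFER(name) DECLARE_BUFFER_TYPE(name); buffer_impl_ ## name name

// Hypothetical consumer: it only ever sees the buffer& interface.
class serial_handler {
public:
    explicit serial_handler(buffer& b) : m_buf(b) {}
    // Stand-in for queueing data: store one byte at the start of the send queue.
    void write(std::uint8_t byte) { m_buf.send_queue()[0] = byte; }
private:
    buffer& m_buf;
};

DECLARE_BUFFER(uart1_buf);
DECLARE_BUFFER(uart2_buf);
serial_handler uart1{uart1_buf};
serial_handler uart2{uart2_buf};
```

Each handler ends up with its own send queue and DMA buffer, and all four arrays are plain static objects, so their sizes should show up per section at link time.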