ST has some application notes that talk about emulating a parallel bus using DMA to GPIO. I appreciate that, but it doesn't answer important questions. I am looking through the reference manual, and I can't seem to find clarify the things that I am concerned about.
I am most concerned about the jitter. The reference manual repeatedly states, that when DMA is triggered (e.g., by a timer), the DMA controller will read the memory and transfer the value to the peripheral. That might be fine with peripherals that have their own FIFO. There, when space is available in the FIFO, DMA is triggered and fills the FIFO. That will probably happen before the FIFO runs empty.
But with GPIO, if the DMA channels doesn't have a FIFO itself, the data will not be ready when the timer triggers and it needs to be fetched from SRAM. So between the timer triggering and between the value actually arriving in the GPIO output register, some time may pass. This might be measurable when looking at the clock output by the timer and the GPIO pins. The DMA controller has to compete for access to the SRAM with the running program, so certain activities by the program may increase the jitter.
Maybe that is a colossal oversight on my part, but ST's reference manual doesn't seem mention a FIFO as part of the DMA. If that is the case, that would result in jitter which may impact performance at higher frequencies.
I need to toggle 3 to 4 pins synchronously to a clock from 100kHz to 1MHz. I am considering DMA to GPIO and also abusing a QuadSPI controller. I am currently testing on a STM32L4 but I'm also considering STM32F4 or even F1.