I recommend reading RPIO source code together with ServoBlaster's, as it's slightly simplified and can help understanding. Also very important: Broadcom's BCM2835 manual which contains all the tiny details.
is there a diagram for the blocks
The manual contains all the functionalities offered by the chip (not in a block diagram though, as far as I’ve seen).
Is the following understanding right:
The DMA controller is part of the main chip (Broadcom, although I think the same happens on desktop CPUs). It can't exactly run code, but it can copy memory across peripherals by itself, without consuming the main processor’s time. The DMA controller has different channels which can copy memory independently and runs independently of the CPU.
It is configurable via "control blocks" (BCM manual page 40, 4.2.1.1): you can tell the DMA controller to first copy memory from A to B, then from C to D and so on.
don't understand how exactly DMA and PWM work together
DMA is used to send data to the PWM controller ("Pulse Width Modulator", BCM manual page 138, chap. 9), which consumes the data and this creates a very precise delay. Interestingly, the PWM controller is... not used to generate any PWM pulse, but just to wait.
Can someone explain how that works?
Ultimately, you configure the value of the GPIO pins (or the settings of the PWM or PCM generator), by setting memory at a special address; the memory in that region represents the peripheral configuration (BCM manual page 89, chapter 6).
So the idea is: copy 1 onto the memory that controls the GPIO pin value, using the DMA controller; wait the pulse width; copy 0 onto the GPIO pin value; wait the remaining part of the period; loop. Since the DMA controller does it, it doesn't consume CPU cycles.
The key point here is being able to make the DMA controller "wait" an exact amount of time, and for this, RPIO and ServoBlaster use the PWM controller in FIFO mode (the PCM generator also has such functionality, but let's stick to PWM). This means that the PWM controller will "send" the data it reads from its so-called FIFO queue, and then stop. It doesn't matter how it's "sent" (BCM manual page 139, 9.4 MSENi=0
), the key point is that it requires a fixed amount of time. As a matter of fact, it doesn't even matter which data is sent: the DMA controller is configured to write into the FIFO queue and then wait until the PWM controller has finished sending data, and this creates a very precise delay.
The resolution of the resulting pulse is given by the duration of the PWM transfer, which depends on the frequency at which the PWM controller is running.
Example
We have a maximum resolution of 1ms (given by the PWM delay), and we want to have a pulse of 25% duty cycle with frequency 125Hz. The period of a pulse is thus 8ms. The DMA operation performed will be
- Set pin to 1 (DMA write to GPIO mem)
- Wait 1ms (DMA write to PWM FIFO)
- Wait 1ms (DMA write to PWM FIFO)
- Set the pin to 0 (DMA write to GPIO mem)
- Wait 1ms (DMA write to PWM FIFO)
- ...repeat "Wait 1ms" 4 more times.
- Wait 1ms (DMA write to PWM FIFO) and jump back to 1.
This will thus require at least 10 DMA control blocks (8 wait instructions, given by period / delay plus 2 write operations).
Note: in ServoBlaster and RPIO, it will consume exactly 16 DMA control blocks, because (for higher precision), they always perform a "memory copy" operation before a "wait operation". The "memory copy" operation is just a dummy unless it needs to change the pin value.