I'm developing an application for a custom board with an STM32F1 MCU that needs to be able to recover from unexpected data corruption.
The data flow is as follows: the master device (a Linux machine) sends a request to the slave, which parses the message and prepares a reply. Then the master reads the reply. The exchange is fast (18 MHz) and is implemented like this:
    if (::ioctl(_fd, SPI_IOC_MESSAGE(2), &transaction) < 0) {
        warn("message not sent");
        return false;
    }
The delay between these two messages is ~50 µs. The message length is fixed.
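For context, here is a minimal sketch of how a two-transfer `transaction` array of this kind could be filled for spidev. The post does not show the real setup, so `fill_transaction` and `kFrameLen` are hypothetical names and the frame length of 12 bytes is an assumption; only the 18 MHz clock and the ~50 µs gap come from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <linux/spi/spidev.h>

// Hypothetical frame length; the post only says the length is fixed.
constexpr std::uint32_t kFrameLen = 12;

// Fill a two-element transfer array: request out, pause, reply in.
// The result is meant to be passed to ioctl(fd, SPI_IOC_MESSAGE(2), xfer).
void fill_transaction(spi_ioc_transfer xfer[2],
                      const std::uint8_t *request, std::uint8_t *reply)
{
    assert(request != nullptr && reply != nullptr);
    std::memset(xfer, 0, 2 * sizeof(spi_ioc_transfer));

    // Transfer 0: the master sends the request to the slave.
    xfer[0].tx_buf = reinterpret_cast<std::uintptr_t>(request);
    xfer[0].len = kFrameLen;
    xfer[0].speed_hz = 18000000;
    xfer[0].bits_per_word = 8;
    xfer[0].delay_usecs = 50;   // pause after this transfer, before the reply

    // Transfer 1: the master clocks the reply out of the slave.
    xfer[1].rx_buf = reinterpret_cast<std::uintptr_t>(reply);
    xfer[1].len = kFrameLen;
    xfer[1].speed_hz = 18000000;
    xfer[1].bits_per_word = 8;
}
```

Both transfers go out in a single `ioctl` call, so the slave has only the `delay_usecs` window to parse the request and arm its reply.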
On the STM32 side I use a DMA-driven SPI driver, implemented as described below.
I'm using SPI2, which is clocked from APB1 at 36 MHz (HSE at 24 MHz, AHB at 72 MHz, APB1 at 36 MHz).
The SPI is configured to read the message (fixed length!) by issuing a DMA request on each RXNE event (CR2->RXDMAEN). After the message is processed, the answer is transmitted via DMA1 (CR2->TXDMAEN).
Everything works like a charm until I interfere somehow. The scenario I'm trying to recover from is the SCLK line being unplugged during a transfer.
I'm struggling to recover from this. I'm going to lay out my thoughts because I'm not sure where the bug is.
The DMA is configured to handle fixed-length messages. That's why, when I interfere, the DMA controller keeps waiting until a whole message length has arrived, and the buffer contents end up shifted. Suppose I have received one third of a message when SCLK suddenly vanishes. The DMA will keep waiting for the remaining two thirds. The master continues to send requests, so once SCLK is back, the first two thirds of the next message are placed in the buffer. The DMA interrupt fires, but the remaining tail of that message is lost. It's lost for sure, but I can detect this by setting ERRIE so that an interrupt is raised when the OVR flag is set.
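The shift described above can be reproduced with a small host-side model. This is only a sketch: `FixedLengthRx`, `feed` and `last_is_aligned` are hypothetical names, the 12-byte frame length is an assumption, and the struct merely imitates a receiver that counts a fixed number of bytes, not the real DMA controller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kFrameLen = 12;  // hypothetical fixed message length

// Model of a receiver that only counts bytes, as a fixed-length DMA does.
struct FixedLengthRx {
    std::uint8_t buf[kFrameLen] = {};
    std::uint8_t last[kFrameLen] = {};  // snapshot taken at "transfer complete"
    std::size_t  count = 0;             // bytes stored since the last completion
    unsigned     completed = 0;         // number of "transfer complete" events

    void on_byte(std::uint8_t b) {
        assert(count < kFrameLen);
        buf[count++] = b;
        if (count == kFrameLen) {       // models the DMA transfer-complete interrupt
            std::memcpy(last, buf, kFrameLen);
            count = 0;
            ++completed;
        }
    }
};

// Clock n bytes of frame `id` into the receiver; every byte carries the
// frame id so that a mixed buffer is easy to spot.
void feed(FixedLengthRx &rx, std::uint8_t id, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) rx.on_byte(id);
}

// True if the last completed buffer holds bytes of a single frame only.
bool last_is_aligned(const FixedLengthRx &rx) {
    for (std::size_t i = 1; i < kFrameLen; ++i)
        if (rx.last[i] != rx.last[0]) return false;
    return true;
}
```

Feeding 4 bytes of frame 1 (SCLK lost after one third) and then full frames 2, 3, ... makes every completed buffer start with 4 bytes of the previous frame. The offset never corrects itself, because nothing but the byte count marks a frame boundary.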
I've tried to handle that interrupt but to no avail.
The interrupt handler I have now checks whether the BSY flag is set (the tail is still being processed by the SPI controller). If it is set, I kill the DMA (which has already started handling the next message) and leave the OVR flag alone. Once BSY is cleared, I clear OVR and reset the DMA for reception.
This doesn't help much.
Another option is a dedicated timer that gets reset on every rising edge of SCLK (a solution inspired by application note AN3109). This way I could implement a DMA timeout: if I have received only part of a message, the timer overflow generates an interrupt once SCLK has been absent for long enough. This solution has issues, though.
I know the description is vague, but I've tried my best and hope somebody with greater insight can help.
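The timeout idea can be sketched as a host-side model. Everything here is an assumption made for illustration: `TimeoutRx`, `feed` and `idle` are hypothetical names, the frame length and `kTimeoutTicks` are made-up values, and a software tick stands in for the hardware timer that SCLK edges would reset. On the real bus the timeout would have to sit between the byte time (8 bits at 18 MHz is about 0.44 µs) and the ~50 µs gap between messages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kFrameLen = 12;     // hypothetical fixed message length
constexpr unsigned    kTimeoutTicks = 5;  // hypothetical idle limit, in ticks

struct TimeoutRx {
    std::uint8_t buf[kFrameLen] = {};
    std::uint8_t last[kFrameLen] = {};  // snapshot of the last complete frame
    std::size_t  count = 0;
    unsigned     idle_ticks = 0;
    unsigned     completed = 0;
    unsigned     timeouts = 0;

    // Bus activity: restarts the idle counter, as an SCLK edge would
    // reset the timer in the hardware version.
    void on_byte(std::uint8_t b) {
        assert(count < kFrameLen);
        idle_ticks = 0;
        buf[count++] = b;
        if (count == kFrameLen) {
            std::memcpy(last, buf, kFrameLen);
            count = 0;
            ++completed;
        }
    }

    // Timer tick: a partial frame that has been idle for too long is dropped.
    // On the MCU this is where the RX DMA would be stopped and re-armed.
    void on_tick() {
        if (count == 0) return;         // nothing pending between frames
        if (++idle_ticks >= kTimeoutTicks) {
            count = 0;
            idle_ticks = 0;
            ++timeouts;
        }
    }
};

void feed(TimeoutRx &rx, std::uint8_t id, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) rx.on_byte(id);
}

void idle(TimeoutRx &rx, unsigned ticks) {
    for (unsigned i = 0; i < ticks; ++i) rx.on_tick();
}
```

After a truncated frame followed by `kTimeoutTicks` idle ticks, the partial data is discarded and the next full frame lands at offset 0 again. A pause shorter than the timeout inside a frame leaves the reception untouched.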
SS as frame-sync. You need timeouts. – too honest for this site
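A rough host-side sketch of what the comment suggests, under the assumption that the master toggles SS around every message and that the slave can react to the SS rising edge (for example through an external interrupt). `FramedRx` and `send_frame` are hypothetical names and the 12-byte frame length is made up; the struct only models the bookkeeping, not the peripheral.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kFrameLen = 12;  // hypothetical fixed message length

struct FramedRx {
    std::uint8_t buf[kFrameLen] = {};
    std::uint8_t last[kFrameLen] = {};  // snapshot of the last complete frame
    std::size_t  count = 0;
    unsigned     completed = 0;
    unsigned     discarded = 0;

    void on_byte(std::uint8_t b) {
        assert(count < kFrameLen);
        buf[count++] = b;
        if (count == kFrameLen) {
            std::memcpy(last, buf, kFrameLen);
            count = 0;
            ++completed;
        }
    }

    // SS rising edge: the master has finished a message. Any partial frame
    // is dropped so that the next message starts at offset 0. On the MCU this
    // step would stand for disabling the RX DMA channel, reloading its
    // counter, and clearing OVR (read DR, then SR) before re-enabling.
    void on_ss_deassert() {
        if (count != 0) {
            count = 0;
            ++discarded;
        }
    }
};

// The master always toggles SS per message, but only `n` of the bytes reach
// the slave (n < kFrameLen models a message hit by the SCLK dropout).
void send_frame(FramedRx &rx, std::uint8_t id, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) rx.on_byte(id);
    rx.on_ss_deassert();
}
```

With this in place a truncated message costs exactly one frame: `send_frame(rx, 1, 4)` is discarded at the SS edge, and the following full frame is received aligned. The same holds when SCLK comes back in the middle of a message, since the partial tail is dropped at the next SS edge.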