How to parse indeterminate length packet from MCU UART?

Question

I am attempting to attach a new microcontroller to an existing OSC (Open Sound Control) network via an ethernet-to-serial converter.

The OSC messages are sent with the following formatting:

OSC Address Pattern: /string/optional substring --> always 32-bit packed/DWORD aligned with extra NULL bytes at the end
Type Tags: int32 (i), float32 (f), string (s)
Data: for certain patterns, this is a known length; for others, it is indeterminate in length

I'm a bit confused as to how best to divide up the code and how to buffer this. In my UART receive ISR, I was hoping to just dump the incoming bytes into an SRAM buffer and then decode them in the main loop, but I'm not certain how to do this with indeterminate-length packets. I've never worked with a variable-length packet that didn't indicate the packet size somewhere in the header. There also doesn't appear to be any EOP/footer to indicate the end of the packet, so you are left decoding some number of bytes based on the type tags.

Some concerns:

1. How to size the buffer? Yes, I could do this dynamically, but I'd prefer to avoid that overhead, if possible. That leads me towards a circular buffer, but how do I pull out what I need in the main loop while continuing to decode the variable-length packets?
2. How much of the decode should be done in the ISR? Do I set up some type of state machine in the main loop that advances based on the ISR decode of the incoming bytes and then flush those bytes as I advance the FSM?

Looking for any advice/guidance on how best to tackle this problem/architect the code. Thank you in advance.

In general, with indeterminate-length data, and given a serial byte stream, there is no possible answer, you are completely stuft and your task cannot be completed. — Martin James
Do not do any decoding inside the ISR. The main loop should do all the decoding/parsing. A circular buffer should be as big as needed only to avoid overrun from incoming chars, i.e., the parsing of the various recognized parts should take less time than to fill up the buffer at your given 'baud' rate. The length of this buffer has nothing to do with the length of your data stream. This problem is very similar to parsing source code with a compiler. The length of the source is unknown to the compiler while it parses more code. No compiler uses a buffer to hold the entire source. — tonypdmtr

kkrambo · Accepted Answer · 2015-12-02T21:18:15

The UART receive ISR should simply read the incoming byte from the UART and copy it to a circular buffer. I wouldn't do any decoding/parsing/interpretation of the bytes in the ISR. The circular buffer should be large enough to contain the largest message you expect to receive (or larger if bursts of messages can arrive faster than your main loop can handle them).
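A minimal sketch of such a buffer, assuming a single producer (the ISR) and a single consumer (the main loop). The names `ring_buffer_t`, `rb_put_from_isr`, `rb_get`, `rb_count` and the 128-byte size are illustrative choices, not taken from any particular vendor library, and the actual read of the UART data register is left to your MCU's ISR:

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 128u  /* assumed size; must be a power of two for the mask below */

typedef struct {
    volatile uint8_t data[RB_SIZE];
    volatile uint8_t head;  /* next write index, modified only by the ISR */
    volatile uint8_t tail;  /* next read index, modified only by the main loop */
} ring_buffer_t;

static ring_buffer_t rx_rb;

/* Call from the UART receive ISR with the byte just read from the UART. */
static inline void rb_put_from_isr(ring_buffer_t *rb, uint8_t byte)
{
    uint8_t next = (uint8_t)((rb->head + 1u) & (RB_SIZE - 1u));
    if (next != rb->tail) {          /* buffer full: the byte is dropped */
        rb->data[rb->head] = byte;
        rb->head = next;
    }
}

/* Number of received bytes not yet consumed by the main loop. */
static inline uint8_t rb_count(const ring_buffer_t *rb)
{
    return (uint8_t)((unsigned)(rb->head - rb->tail) & (RB_SIZE - 1u));
}

/* Call from the main loop; returns false when no byte is available. */
static inline bool rb_get(ring_buffer_t *rb, uint8_t *out)
{
    if (rb->head == rb->tail) {
        return false;
    }
    *out = rb->data[rb->tail];
    rb->tail = (uint8_t)((rb->tail + 1u) & (RB_SIZE - 1u));
    return true;
}
```

The indices are single bytes so that each one can be read or written in one access even on an 8-bit MCU. One slot is always left empty to distinguish "full" from "empty", so this sketch holds at most `RB_SIZE - 1` bytes.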

The main loop should use a state machine to parse the byte stream into a message. The states could be named READ_ADDRESS, READ_TYPE, and READ_DATA. Or you may want to have a unique READ_DATA state for each message type.
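As a rough illustration, the first two states could look something like this. It is only a sketch: `osc_parser_t`, `osc_string_done` and `osc_parser_feed` are made-up names, and it assumes that both the address pattern and the type tag string are NUL-terminated and padded with NULs to a multiple of 4 bytes, as described in the question and in the OSC 1.0 specification:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { READ_ADDRESS, READ_TYPE, READ_DATA } osc_state_t;

typedef struct {
    osc_state_t state;
    size_t      field_len;  /* bytes consumed so far in the current string */
    bool        saw_nul;    /* true once the string's NUL terminator was seen */
} osc_parser_t;

/* Account for one byte of a padded string field.
 * Returns true when this byte completes the field, i.e. on the first
 * 4-byte boundary at or after the NUL terminator. */
static bool osc_string_done(osc_parser_t *p, uint8_t byte)
{
    p->field_len++;
    if (byte == 0) {
        p->saw_nul = true;
    }
    if (p->saw_nul && (p->field_len % 4u) == 0u) {
        p->field_len = 0;
        p->saw_nul = false;
        return true;
    }
    return false;
}

/* Feed one byte taken from the circular buffer into the state machine. */
static void osc_parser_feed(osc_parser_t *p, uint8_t byte)
{
    switch (p->state) {
    case READ_ADDRESS:
        if (osc_string_done(p, byte)) {
            p->state = READ_TYPE;
        }
        break;
    case READ_TYPE:
        if (osc_string_done(p, byte)) {
            p->state = READ_DATA;
        }
        break;
    case READ_DATA:
        /* Depends on the type tags; deliberately left out of this sketch. */
        break;
    }
}
```

A real parser would also store the address and type tag bytes as they arrive so that the data state knows what to expect; that bookkeeping is omitted here to keep the transitions visible.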

From a high level view, each state should read the appropriate number of bytes from the circular buffer and then advance to the next state. It sounds like the data state handler function will have to be smart enough to calculate the appropriate number of data bytes. If you have to interpret some of the data in order to figure the data length then you might want to consider breaking the READ_DATA state into two states, one for reading enough to figure the length and the other for reading the rest of the data.
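For the three argument types listed in the question, the length calculation might be sketched as below. `osc_fixed_data_len` is a hypothetical helper; it assumes `i` and `f` arguments are always 4 bytes each and that `s` arguments are NUL-terminated strings whose length can only be found by reading them:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns the number of fixed-size argument bytes implied by a type tag
 * string such as ",if". Sets *has_string when an 's' tag is present; in
 * that case the string arguments still have to be read up to their NUL
 * terminator and padding before the message length is known. */
static size_t osc_fixed_data_len(const char *tags, bool *has_string)
{
    size_t len = 0;
    *has_string = false;

    if (*tags == ',') {
        tags++;                 /* skip the leading comma of the tag string */
    }
    for (; *tags != '\0'; tags++) {
        switch (*tags) {
        case 'i':               /* int32 */
        case 'f':               /* float32 */
            len += 4;
            break;
        case 's':               /* string: length not known in advance */
            *has_string = true;
            break;
        default:                /* other tags are ignored in this sketch */
            break;
        }
    }
    return len;
}
```

When `has_string` comes back false, the data state can simply wait for `len` bytes. When it is true, this is the case where splitting READ_DATA into two states, as suggested above, is likely to help.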

When you're designing your state handler functions, consider that the appropriate number of bytes may not have been received into the circular buffer yet. From a low level view, the state handlers should return to the main loop when not enough bytes are available. This allows the main loop to service other parts of the application while the ISR receives more bytes. Then, when your main loop calls the state machine again, the state handler should pick up where it left off.
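Here is one way a handler can "return early and resume later", shown for a single int32 argument. Everything in it is illustrative: `byte_source_t` is a simplified stand-in for the circular buffer, `MESSAGE_DONE` is an invented final state, and the big-endian byte order is an assumption based on the OSC 1.0 specification:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the receive buffer (hypothetical). */
typedef struct {
    const uint8_t *bytes;
    size_t count;  /* bytes received so far */
    size_t pos;    /* bytes already consumed by the parser */
} byte_source_t;

typedef enum { READ_DATA, MESSAGE_DONE } state_t;

typedef struct {
    state_t state;
    int32_t value;
} parser_t;

/* Handler for one int32 argument. If fewer than four bytes are waiting it
 * returns without consuming anything and without changing the state, so
 * the next call from the main loop simply tries again. */
static void handle_read_data(parser_t *p, byte_source_t *src)
{
    if (src->count - src->pos < 4u) {
        return;  /* not enough data yet; let the main loop do other work */
    }
    const uint8_t *b = &src->bytes[src->pos];
    uint32_t raw = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                   ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    p->value = (int32_t)raw;
    src->pos += 4u;
    p->state = MESSAGE_DONE;
}
```

Because the handler keeps all of its progress in `parser_t` and the buffer position, calling it repeatedly from the main loop is harmless: it does nothing until the ISR has delivered the remaining bytes.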
