I am using gRPC streaming in Java. I have a long-lived open stream on which the client and server communicate simultaneously. When I call onNext to send a message, gRPC buffers the message internally and sends it on the wire asynchronously. If the stream is lost in the middle of sending data, onError is called. I would like to know the right practices for:

  1. finding out which messages were sent successfully
  2. retrying unsent messages

Currently, I am thinking of implementing an "ack" mechanism in the application layer where for every x items received, the receiver sends back an ack message. Then in order to implement retries, I need to buffer items on the sender side and only remove them from the buffer when the ack is received. Also, on the receiver side, I need to implement a mechanism to ignore duplicate items received.
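
The sender-side half of this idea can be sketched as follows. This is only a minimal illustration, assuming each message is given a monotonically increasing sequence number and that acks are cumulative (an ack for N covers everything up to and including N). `RetransmitBuffer` and its methods are hypothetical names, not part of gRPC; the call to `onNext` and the reconnect logic are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sender-side buffer: keeps each message until the receiver acks it. */
final class RetransmitBuffer<T> {
    /** A payload paired with the sequence number assigned when it was recorded. */
    static final class Entry<T> {
        final long seq;
        final T payload;

        Entry(long seq, T payload) {
            this.seq = seq;
            this.payload = payload;
        }
    }

    private final Deque<Entry<T>> unacked = new ArrayDeque<>();
    private long nextSeq = 0;

    /** Assigns the next sequence number and remembers the message until it is acked. */
    synchronized Entry<T> record(T payload) {
        Entry<T> entry = new Entry<>(nextSeq++, payload);
        unacked.addLast(entry);
        return entry;
    }

    /** Cumulative ack: drops every buffered message with seq <= ackedSeq. */
    synchronized void ack(long ackedSeq) {
        while (!unacked.isEmpty() && unacked.peekFirst().seq <= ackedSeq) {
            unacked.removeFirst();
        }
    }

    /** Snapshot of the messages to resend, in order, once a new stream is open. */
    synchronized List<Entry<T>> pending() {
        return new ArrayList<>(unacked);
    }
}
```

The sender would call `record` just before `onNext`, call `ack` whenever an ack message arrives, and, after `onError`, replay `pending()` on a fresh stream.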

Example: Suppose we send an ack for every 100 items. We receive the ack for batch 3 (items 200-300) and then get an error while sending items 300-400. We retry sending items 300-400, but the client has already received items 300-330 successfully and is about to receive them again, so the client needs to ignore the first 30 items.
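
To make the example concrete, here is a rough receiver-side sketch. It assumes zero-based sequence numbers, in-order delivery within a stream, and reads "300-330" as items 300 through 329 (30 items). `DedupReceiver` is a hypothetical name, not something gRPC provides.

```java
/** Hypothetical receiver-side filter that drops items it has already seen. */
final class DedupReceiver {
    // Sequence number of the next item we have not yet processed.
    private long nextExpected = 0;

    /**
     * Returns true if the item is new and should be processed,
     * false if it is a duplicate from a retry.
     */
    synchronized boolean accept(long seq) {
        if (seq < nextExpected) {
            return false; // already received before the stream broke
        }
        if (seq > nextExpected) {
            // Should not happen if the sender replays from its first unacked item.
            throw new IllegalStateException(
                "gap: expected " + nextExpected + " but got " + seq);
        }
        nextExpected++;
        return true;
    }

    /** Highest sequence number received in order; usable as a cumulative ack. */
    synchronized long lastReceived() {
        return nextExpected - 1;
    }
}
```

With items 0 through 329 already accepted, replaying 300 through 399 makes `accept` return false for the first 30 items and true for the remaining 70.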

It is possible to implement this in the application layer. However, I am wondering if there are better practices/frameworks out there that solve this problem.

1 Answer

The term often used for delivering data from one place to another without loss is "guaranteed delivery".

Your use case is similar to providing guaranteed delivery over a best-effort transport protocol such as UDP. The usual approach is to acknowledge every packet, although you could devise a scheme that acknowledges at a higher level (for example, per batch), as you suggest.

You also usually want some form of sliding window, which means you don't have to wait for the previous ack before sending the next packet. This helps avoid delays.

There is a very good overview of this approach on UDP in this answer: https://stackoverflow.com/a/15630015/334402

For your case, the response messages that the other side sends back on the stream can effectively serve as the acks. Using a sliding window would allow you to make the next call before you have received the ack for the previous one.
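
A sliding window of this kind could look roughly like the sketch below. It assumes cumulative acks and zero-based sequence numbers; `SendWindow` and its methods are hypothetical names, and the non-blocking `tryReserve` is just one possible design (a blocking variant would work as well).

```java
import java.util.concurrent.Semaphore;

/** Hypothetical sliding window: at most windowSize messages may be unacked at once. */
final class SendWindow {
    private final Semaphore slots;
    private long highestAcked = -1;

    SendWindow(int windowSize) {
        this.slots = new Semaphore(windowSize);
    }

    /** Tries to reserve a slot for one outgoing message; false means the window is full. */
    boolean tryReserve() {
        return slots.tryAcquire();
    }

    /** Cumulative ack: frees one slot per newly acknowledged message. */
    synchronized void onAck(long ackedSeq) {
        if (ackedSeq > highestAcked) {
            slots.release((int) (ackedSeq - highestAcked));
            highestAcked = ackedSeq;
        }
        // Stale or repeated acks are ignored so slots are never released twice.
    }

    /** Number of messages that can still be sent before an ack is required. */
    int available() {
        return slots.availablePermits();
    }
}
```

The sender reserves a slot before each `onNext` and keeps sending while slots remain, so it only stalls when a full window of messages is still waiting for an ack.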

Your duplicate delivery example is also quite common. One common way to avoid double counting or confusion is to number the packets and simply discard any packet whose number has already been seen.