0
votes

We are evaluating SNS for our messaging requirements to integrate multiple applications. we have a single producer that publishes messages to multiple topics on SNS. Each topic has 2-5 subscribers. In event of subscriber failures (down for maintenance) I have a few questions on the recommended strategy of using SQS queues per consumer

  1. Is it possible to configure SNS to push to SQS only in event of failure in delivering the message to a subscriber? Dumping all the messages in SQS queue creates a problem for the consumer to analyze all messages in the queue when it restarts.
  2. In event of subscriber failure, it can read messages from SQS queue on restart but how would it know that it missed messages from SNS when it was overloaded?

Any suggestions on handling subscriber failures are welcome.

Thanks!

1

1 Answers

1
votes

No, it is not possible to "configure SNS to push to SQS only in event of failure".

Rather than trying to recover a message after a failure, you can configure the Amazon SNS retry policies.

From Setting Amazon SNS Delivery Retry Policies for HTTP/HTTPS Endpoints:

You can use delivery policies to control not only the total number of retries, but also the time delay between each retry. You can specify up to 100 total retries distributed among four discrete phases. The maximum lifetime of a message in the system is one hour. This one hour limit cannot be extended by a delivery policy.

So, you don't need to worry as long as the destination is back online within an hour.

If it is likely to be offline for more than an hour, you will need to find a way to store and "replay" the messages, possibly by inspecting CloudWatch Logs.

Or, here's another idea...

Push initially to SQS. Have an AWS Lambda function triggered by SQS. The Lambda function can do the 'push' that would normally be done by SNS. If it fails, then the standard SQS invisibility process will retry it later, eventually going to a Dead Letter Queue.