I want to delay processing of messages from an AWS Kinesis stream for one hour. I've configured the KCL consumer to read a batch of records every four minutes, check the timestamp of each record, and stop processing the batch if any of the records is less than an hour old, without checkpointing. I was hoping that the same consumer instance would reread the same messages every four minutes until the entire batch is old enough to process, followed by checkpointing the consumer. However, in practice, the consumer reads the messages only one time, which means that they are ignored and never read again when they are ready to process. Is there a way to configure the consumer to reread all messages from the last checkpoint every time?
1 Answers
I would love something like that (a configuration that could delay the delivery of an event) out of the box from AWS Kinesis Stream. In absence of that, there is a way you could delay the processing of the event at a cost of wasted compute.
Use SQS (or FIFO SQS, if you care about event ordering) instead of Kinesis or use an AWS Lambda on Kinesis Stream to transfer the events to SQS. SQS supports delaying the delivery of a message up to 15 minutes. Since you need the delay to be 60 minutes, you can run another Lambda (or your own SQS consumer) to process the message. On the first delivery of a message to Lambda (or your SQS consumer) do not process the message but simply set the visibility timeout of the message to 45 minutes (adding up to your required delay of 60 minutes). Process the SQS message only after you have received it the second time. You can check how many times a message has been delivered before to make your decision about if you want to process or skip processing the message.