
I have an AWS Lambda function that reads a CSV file and saves a bunch of records to an SQS queue. There will be thousands of records so they won't fit in one message.

Then there is another Lambda function, triggered by that queue, that processes each record. Each record takes around a second to process.

I need to send an email after all records have been processed.

What's the best way to do it?

I had two ideas:

  1. Make it a FIFO queue and add a LastRecord: true attribute to the last record. Send the email when I read that record.

  2. Update a DynamoDB table with the total number of records and the number of records processed so far, and have a Lambda function poll the table until total === processed.
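A minimal sketch of idea 2, written in Python purely for illustration. It swaps the polling function for an atomic counter: each invocation of the second Lambda function increments the counter and checks whether it has just reached the total. The attribute names `job_id`, `total`, and `processed`, and the `send_email` callback, are hypothetical; `table` is assumed to behave like a boto3 DynamoDB `Table` resource whose item already holds the total written by the first function.

```python
def record_processed(table, job_id, send_email):
    """Atomically bump the processed counter; email when it reaches the total.

    Assumes the item for `job_id` already has a numeric `total` attribute.
    Returns True only for the call whose increment reaches the total.
    """
    response = table.update_item(
        Key={"job_id": job_id},
        # ADD is an atomic increment, so concurrent invocations do not race.
        UpdateExpression="ADD #p :one",
        ExpressionAttributeNames={"#p": "processed"},
        ExpressionAttributeValues={":one": 1},
        # ALL_NEW returns the whole item after the update, including `total`.
        ReturnValues="ALL_NEW",
    )
    item = response["Attributes"]
    done = item["processed"] == item["total"]
    if done:
        send_email(job_id)
    return done
```

Note that a standard SQS queue can deliver a message more than once, so a real version would probably need to guard against counting the same record twice.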

Any better ideas?

How long does the second Lambda function take to process the records? Could the first Lambda function send all the records in one message to SQS, instead of in separate messages? This would effectively be the same as using a FIFO queue, but it would run as one Lambda function instead of many. - John Rotenstein
Thanks @JohnRotenstein , I've edited the question. Around a second for the 2nd lambda function and there might be thousands of records so they won't fit in one message. - LorenzoR
I don't think the SQS trigger is available for FIFO queues. You could instead create a Lambda function that polls SQS and sends the email when no item is returned. You may also consider Step Functions: when a batch of items is added to the queue, you can start a state machine execution. I would imagine a state machine with 4 steps. The first polls the messages and passes them to the next step, which processes all the records. Then it goes back to the polling task for as long as items remain in the queue. When no item is returned, the final step would be an SNS task that sends out the email notification. - A.Khan
SQS Trigger is now available for FIFO as of November 2019: aws.amazon.com/blogs/compute/… - hackerrdave

1 Answer


Since message order is relevant, you should consider using Amazon Kinesis Data Streams instead of Amazon SQS.

Some differences:

  • Messages are kept in order within each shard
  • Messages can be replayed (they are retained for a period)
  • You can trigger an AWS Lambda function from a Kinesis stream
  • Multiple shards can be used for parallel processing
  • Pricing for Kinesis is per shard hour, while SQS is priced per request (e.g. send, receive, delete)

You could send your "last message" signal through the stream to trigger the final email.
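As a rough sketch of that approach (in Python, for illustration only), a Lambda handler for a Kinesis trigger could decode each record in order and send the email when it meets the flagged one. This assumes each payload is JSON, that the final record carries the `LastRecord` attribute from the question, and that all records go to a single shard so ordering holds. `process_record` and `send_email` are hypothetical callbacks.

```python
import base64
import json


def handle_kinesis_event(event, process_record, send_email):
    """Process Kinesis records in order; send the email on the flagged record."""
    for record in event["Records"]:
        # Lambda delivers the Kinesis payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        process_record(payload)
        # Assumed convention: the producer marks the final record.
        if payload.get("LastRecord"):
            send_email()
```

With more than one shard, the flagged record could be read before records in other shards have been processed, so this only works as written for a single shard.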