I'm uploading a 250 MB file containing 1 million records to an AWS S3 bucket (B1). This triggers a Lambda (L1: 1.5 GB, 3 min) which reads the file, groups the records by some criteria, and writes about 25K files to a different S3 bucket (B2).

A notification event configured on bucket B2 then generates 25K events (requests) to a different Lambda (L2: 512 MB, 2 min, concurrency 2). This Lambda calls a Java-based microservice that processes each request and makes an entry in the DB, which takes about 1-2 seconds per call.

The problem is that once the second Lambda (L2) is triggered, there's no way to stop it. It runs for hours and does not receive any other events until it has processed all of the pending ones, and I have no control over the S3 events that have already been triggered.

Can someone please explain how the events that S3 triggers on file upload are processed (the architecture), and how to get fine-grained control over S3 events once they have been triggered?

Is there anything I can do on the AWS Lambda side to stop S3 events that have already been triggered?

1 Answer

I don't think setting a notification event on B2 is the best option when you are writing 25K objects at a time. I think the process can be simplified.

  • Lambda L1, which writes the 25K objects to B2, can build an array of the object keys it writes and put that in B2 as well. Make sure it is written to a separate folder (prefix), and that the notification event is set on that folder and not on the location where the 25K files are written.

  • L2 will then be triggered once, when the file containing the keys of the 25K objects is written, and can pass those keys to your microservice.
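
A minimal sketch of this manifest idea, in Python for brevity. The `manifests/` prefix, the JSON layout, and the function names are my own assumptions for illustration; `s3` is expected to be a boto3 S3 client (`boto3.client("s3")`), and the B2 notification is assumed to be filtered to the `manifests/` prefix.

```python
import json
from urllib.parse import unquote_plus


def write_manifest(s3, bucket, keys, manifest_key="manifests/batch.json"):
    """L1 side: store the list of written object keys as a single JSON object.

    manifest_key is a hypothetical name; it only needs to live under the
    prefix that the bucket notification is filtered on.
    """
    body = json.dumps({"keys": list(keys)}).encode("utf-8")
    s3.put_object(Bucket=bucket, Key=manifest_key, Body=body)
    return manifest_key


def read_manifest(s3, event):
    """L2 side: load the key list named by the S3 notification event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["object"]["key"])
    obj = s3.get_object(Bucket=bucket, Key=key)
    return json.loads(obj["Body"].read())["keys"]
```

With this in place L2 gets one event per batch instead of 25K, so it can decide itself how fast to hand the keys to the microservice.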

Another option using SNS

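  • Lambda L1, which writes the 25K objects to B2, can build an array of the object keys it writes and publish it to an SNS topic. The SNS message size limit is 256 KB, which should be enough for your use case as long as the keys are short; otherwise the list can be split across several messages.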

  • Your microservice can subscribe to the SNS topic to receive the object keys and make the entries in the DB.
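
A rough sketch of the SNS variant, again in Python. Since 25K keys only fit into 256 KB if each key is around 10 bytes, the sketch splits the list into several messages when needed. `chunk_keys` and `publish_keys` are hypothetical helper names, and `sns` is assumed to be a boto3 SNS client (`boto3.client("sns")`).

```python
import json

SNS_MAX_BYTES = 256 * 1024  # SNS message size limit


def chunk_keys(keys, max_bytes=SNS_MAX_BYTES):
    """Split keys into JSON array strings that each fit within max_bytes."""
    messages, batch, size = [], [], 2  # 2 bytes for the enclosing "[]"
    for key in keys:
        # Encoded key plus the ", " separator (slightly conservative).
        item = len(json.dumps(key).encode("utf-8")) + 2
        if batch and size + item > max_bytes:
            messages.append(json.dumps(batch))
            batch, size = [], 2
        batch.append(key)
        size += item
    if batch:
        messages.append(json.dumps(batch))
    return messages


def publish_keys(sns, topic_arn, keys):
    """Publish the key list to the topic, one message per chunk."""
    for message in chunk_keys(keys):
        sns.publish(TopicArn=topic_arn, Message=message)
```

The microservice then receives a handful of messages, each holding a JSON array of keys, rather than one request per object.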