I'm new to AWS and found out that the AWS SDK can't fetch multiple S3 objects in a single request. I could loop over the GET requests, but that would take a long time in a single function. I heard that Lambda can run multiple functions at once and that SQS could help me with that.
So how would you set up a Lambda and SQS system that sums all digits found in all files of an S3 bucket?
For example, if I have 6000 files in a bucket, a first Lambda counts them and sends a message to SQS with the number of files. SQS then triggers a Lambda that runs until just before it times out and sends SQS a message containing the sum of digits found in the files it read and the last index it read. That message triggers the next Lambda, and so on until all files are read and summed; the last Lambda returns the total sum.
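The chained hand-off described above can be simulated locally, without any AWS calls. This is only a sketch under stated assumptions: `FILES`, `handler`, and `run_chain` are hypothetical names, an in-memory `deque` stands in for SQS, a fixed per-invocation file `budget` stands in for "run until just before the timeout", and "sum of digits" is taken to mean adding up every ASCII digit character.

```python
from collections import deque
import string

# Hypothetical file contents standing in for the objects in the bucket.
FILES = ["12", "a3", "x", "45", "6"]

def digit_sum(text):
    """Add up every ASCII digit character in the text."""
    return sum(int(ch) for ch in text if ch in string.digits)

def handler(message, files, budget):
    """One simulated invocation: read up to `budget` files, then hand off."""
    index, total = message["next_index"], message["sum"]
    end = min(index + budget, len(files))
    for i in range(index, end):
        total += digit_sum(files[i])
    # The returned message carries the running sum and the next index to read.
    return {"next_index": end, "sum": total, "done": end >= len(files)}

def run_chain(files, budget):
    """Drive the chain through an in-memory queue (a stand-in for SQS)."""
    queue = deque([{"next_index": 0, "sum": 0}])
    invocations = 0
    while queue:
        result = handler(queue.popleft(), files, budget)
        invocations += 1
        if result["done"]:
            return result["sum"], invocations
        queue.append(result)

total, invocations = run_chain(FILES, budget=2)
print(total, invocations)
```

Note that each invocation waits for the previous one to finish, so this chain works around the timeout but does not run anything in parallel.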
Maybe better: the first Lambda fires several parallel Lambdas, each of which adds to a sum stored somewhere when it completes, and in the end the sum is returned to me. If this sounds logical, maybe it could be done by chunking the objects in S3.
The basic idea is not bad, but queuing will make the processing asynchronous, not necessarily faster. At the end you also need to aggregate the results, which adds complexity to your proposed approach. This task (summing over all files) is a good example for the map-reduce approach. - gusto2
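To illustrate the map-reduce idea from the comment, here is a minimal local sketch, not a definitive setup. It assumes "sum of digits" means adding every ASCII digit character; `FAKE_BUCKET`, `chunk`, `map_chunk`, and `total_digit_sum` are hypothetical names, a dict stands in for the bucket, and a thread pool stands in for parallel Lambda invocations. In a real deployment, `read_object` would presumably wrap an S3 read and each `map_chunk` call would be its own Lambda.

```python
from concurrent.futures import ThreadPoolExecutor
import string

# Hypothetical in-memory stand-in for an S3 bucket: key -> file contents.
FAKE_BUCKET = {
    "a.txt": "abc 123",
    "b.txt": "no digits here",
    "c.txt": "4 and 56",
    "d.txt": "7",
}

def chunk(keys, size):
    """Split the key list into chunks of at most `size` keys."""
    keys = list(keys)
    return [keys[i:i + size] for i in range(0, len(keys), size)]

def sum_digits(text):
    """Add up every ASCII digit character in the text."""
    return sum(int(ch) for ch in text if ch in string.digits)

def map_chunk(keys, read_object):
    """Map step: partial sum for one chunk (one worker's share of the keys)."""
    return sum(sum_digits(read_object(key)) for key in keys)

def total_digit_sum(keys, read_object, chunk_size=2, workers=4):
    """Run the map step in parallel, then reduce the partial sums."""
    chunks = chunk(sorted(keys), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: map_chunk(c, read_object), chunks))
    # Reduce step: addition is associative, so chunk order does not matter.
    return sum(partials)

print(total_digit_sum(FAKE_BUCKET.keys(), FAKE_BUCKET.__getitem__))
```

Because each worker returns its partial sum to a single reducer, no shared counter is needed, which avoids the concurrent-update problem of having every parallel function add to one stored value.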