Is Lambda Scaling Behavior Related to Error Rate?

Question

We've been using AWS Lambda for some time and have launched another Lambda process, but this time we've noticed unexpected behavior in how Lambda scales relative to the reported error rate. We have an SQS -> Lambda setup with a max concurrency of 200. There are approximately 100k messages in the queue, and each takes about 3 seconds to process. According to the console, the Lambda fails about 5%-10% of the time, which is expected with our current approach. However, Lambda only reaches about 45-50 concurrent executions. We adjusted memory, timeout, queue settings, etc., and nothing worked. Finally, we made the Lambda always exit successfully and, sure enough, it reached maximum concurrency instantly.

Why is this? The documentation doesn't mention anywhere that error rate is related to concurrency or scaling behavior. Has anyone else experienced this? It kind of makes sense as a safeguard for the end user, but we were not expecting it. We're adjusting our current approach to account for this theory.

tl;dr: The Lambda has a max concurrency of 200 and is triggered by SQS. When the error rate is 5%-10%, it reaches about 45-50 concurrent executions. When the error rate is 0%, it reaches the full 200 concurrent executions. Why?

It is probably due to retries. What is causing the "failures"? See: Managing Concurrency - AWS Lambda — John Rotenstein

Anas Alkhatib · Accepted Answer · 2020-04-17T17:52:31

Yes, it is. It was not documented anywhere previously, but that was the behaviour I experienced as well.

Confirmed here: https://aws.amazon.com/premiumsupport/knowledge-center/lambda-sqs-scaling/

If there are any errors when Lambda attempts to invoke your function, the service prevents your function from scaling to prevent errors at scale. As soon as the errors stop, Lambda continues to scale up your function. It scales up 60 additional concurrent invocations per minute as long as your account isn't at or near the service quota for scaling or burst concurrency in the Region. Your function can scale up to a maximum of 1,000 concurrent invocations.
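To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes the 60-per-minute step from the quote above, one message per invocation, and no retries; the function names and the `start` parameter are hypothetical, and real scaling will differ with batch size and account limits.

```python
import math


def minutes_to_reach(target, start=0, step_per_minute=60, ceiling=1000):
    """Minutes of error-free scaling needed to go from `start` to `target`.

    Assumes a fixed step of `step_per_minute` additional concurrent
    invocations per minute and a hard ceiling, as in the quoted article.
    """
    target = min(target, ceiling)
    if target <= start:
        return 0
    return math.ceil((target - start) / step_per_minute)


def drain_minutes(messages, seconds_per_message, concurrency):
    """Rough time to work through a backlog at a constant concurrency.

    Simplification: one message per invocation, no retries, no ramp-up.
    """
    return messages * seconds_per_message / concurrency / 60


if __name__ == "__main__":
    # Numbers from the question: 100k messages, about 3 s each.
    print(minutes_to_reach(200, start=50))  # stuck at ~50, then errors stop
    print(drain_minutes(100_000, 3, 50))    # backlog at ~50 concurrent
    print(drain_minutes(100_000, 3, 200))   # backlog at the full 200
```

Under these assumptions, a function held at about 50 concurrent executions needs roughly 100 minutes for the backlog in the question, versus about 25 minutes at the full 200, so even a modest error rate can have a large effect on throughput.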
