Lambda scaling with SQS trigger

Question

I have defined an SQS trigger for my Lambda. Inside the Lambda, I call a third-party API that is rate-limited by tokens (250 tokens per minute). Initially I set a batch size of 250 and a batch window of 65 s, but the Lambda processed the requests concurrently and the tokens were exhausted very quickly.

After trying various values for batch size, batch window, and concurrency, the process finally ran smoothly with batch size 10, batch window 10, and reserved concurrency 7, but at that time there were only 300,000 product IDs in the queue. Yesterday, when I pushed 4 million product IDs to the queue, the tokens again started getting exhausted very quickly. When I checked the logs, I found that Lambda picks up a different number of messages at different intervals: in one minute it takes 200 messages, in another 400. The number differs every time.

What I want is for Lambda to pick up only 250 messages from the queue per minute, no matter how many messages are in the queue. How can I do this?

How come it got 200 or 400 messages if your batch size is 10? — Marcin
Yes that's the problem. I have uploaded the logs. You can check it. — nats
@Marcin Those 200 and 400 were per minute, not batch size. And those values seem ok, if you set batch size and window both to 10. — Jens
The max I can count is 350 messages = 7 concurrency * 10 messages in batch * 5 threads. But if your function runs for less than 1 minute, you can easily go over 400. Anyway, the answer is that you can't control this fully. It's up to AWS how it polls and scales the polling. — Marcin
Yes @Maurice. With that approach too, the number of messages picked up by Lambda was different every time. — nats

Marcin · Accepted Answer · 2021-03-19T06:39:36

Long story short, you don't have full control over how Lambda polls SQS. This is made clear by an AWS rep in SQS Lambda Trigger Polling Issue:

Since this is entirely dependent on the Lambda service, the polling mechanism cannot be controlled.

What's more, Lambda uses five polling threads:

the Lambda service will begin polling the SQS queue using five parallel long-polling connections.

So with your setup, you can easily get thousands of messages a minute (depending on how long a function runs):

7 concurrency * 10 messages in batch * 5 threads * 6 polls per minute = 2100 per minute
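
The estimate above can be reproduced in a few lines of Python. This only restates the answer's rough upper bound; it is not a model of how Lambda actually schedules its pollers. The function name `max_messages_per_minute` is made up for illustration, and reading the factor 6 as 60 s divided by the 10 s batch window is an assumption.

```python
def max_messages_per_minute(concurrency, batch_size, pollers, polls_per_minute):
    """Rough upper bound from the answer: every factor is simply multiplied."""
    return concurrency * batch_size * pollers * polls_per_minute


# Values from the question: reserved concurrency 7, batch size 10,
# five long-polling connections, and (assumed) 60 s / 10 s window = 6 polls.
estimate = max_messages_per_minute(7, 10, 5, 60 // 10)

print(estimate)        # 2100
print(estimate / 250)  # 8.4, i.e. about eight times the 250-token budget
```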

As the AWS rep writes, the only way to combat this issue is not to use SQS with lambda directly:

The only way to mitigate this is to disable the SQS triggers on the Lambda function.
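
The quoted advice does not say what to use in place of the trigger. One possible shape, sketched here under assumptions, is a small worker that pulls from the queue itself and stops once it has handled 250 messages in the current minute. Everything named below (`drain_at_fixed_rate`, `receive`, `handle`, `windows`) is hypothetical, and the queue access is injected as a plain callable so the sketch runs without AWS credentials.

```python
import time
from typing import Callable, List


def drain_at_fixed_rate(
    receive: Callable[[int], List[str]],
    handle: Callable[[str], None],
    per_minute: int = 250,
    max_batch: int = 10,
    windows: int = 1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Handle at most `per_minute` messages in each 60-second window.

    `receive(n)` must return up to `n` messages (an empty list if the
    queue is currently empty); `handle(msg)` does the rate-limited work.
    Returns the total number of messages handled.
    """
    total = 0
    for _ in range(windows):
        window_start = clock()
        budget = per_minute
        while budget > 0:
            # Never ask for more than the remaining budget for this window.
            batch = receive(min(max_batch, budget))
            if not batch:
                break  # nothing to do right now
            for message in batch:
                handle(message)
            budget -= len(batch)
            total += len(batch)
        # Wait out the rest of the window before starting the next one.
        remaining = 60.0 - (clock() - window_start)
        if remaining > 0:
            sleep(remaining)
    return total


if __name__ == "__main__":
    # Tiny in-memory demo with a fake queue and no real sleeping.
    queue = [f"product-{i}" for i in range(1000)]

    def fake_receive(n: int) -> List[str]:
        batch = queue[:n]
        del queue[:n]
        return batch

    handled = drain_at_fixed_rate(fake_receive, lambda m: None, sleep=lambda s: None)
    print(handled, len(queue))
```

In a real deployment `receive` could wrap boto3's `receive_message` on the SQS client (which returns at most 10 messages per call), with each message deleted only after `handle` succeeds. Because the budget is tracked in one process, this only enforces the limit if a single worker runs at a time.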
