3
votes

I have defined an SQS trigger for my Lambda. Inside the Lambda, I call a third-party API that is rate-limited by tokens (250 tokens per minute). Initially I set a batch size of 250 and a batch window of 65 s, but Lambda processed the requests concurrently and the tokens were exhausted very quickly.

After trying various values for batch size, batch window, and concurrency, the process finally ran smoothly with batch size 10, batch window 10, and reserved concurrency 7, but at that time there were only 300,000 product IDs in the queue. Yesterday, when I pushed 4 million product IDs to the queue, the tokens again started getting exhausted very quickly. When I checked the logs, I found that Lambda picks up a different number of messages in each interval: in one minute it takes 200 messages, in another 400. The number differs every time.

What I want is for Lambda to pick up only 250 messages from the queue per minute, no matter how many messages are in the queue. How can I do this?


2
How come it got 200 or 400 messages if your batch size is 10? - Marcin
Yes that's the problem. I have uploaded the logs. You can check it. - nats
@Marcin Those 200 and 400 were per minute, not batch size. And those values seem ok, if you set batch size and window both to 10. - Jens
The max I can count is 350 messages = 7 concurrency * 10 messages in batch * 5 threads. But if your function runs for less than 1 minute, you can easily go over 400. Anyway, the answer is that you can't fully control this. It's up to AWS how it polls and scales the polling. - Marcin
Yes @Maurice. With that approach too, the number of messages picked up by Lambda was different every time. - nats

2 Answers

2
votes

Long story short, you don't have full control over how Lambda polls SQS. An AWS rep makes this clear in SQS Lambda Trigger Polling Issue:

Since this is entirely dependent on the Lambda service, the polling mechanism cannot be controlled.

What's more, Lambda uses five polling threads:

the Lambda service will begin polling the SQS queue using five parallel long-polling connections.

So with your setup, you can easily get thousands of messages a minute (depending on how long a function invocation lasts):

7 concurrency * 10 messages in batch * 5 threads * 6 polls per minute = 2100 messages per minute
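
The multiplication above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the function and parameter names are made up for illustration, and the 6 polls per minute is the rough figure used in the estimate, not a documented value.

```python
TOKENS_PER_MINUTE = 250  # third-party API budget from the question


def estimated_messages_per_minute(concurrency, batch_size, pollers, polls_per_minute):
    """Rough upper-bound estimate: simply multiplies the four factors."""
    return concurrency * batch_size * pollers * polls_per_minute


estimate = estimated_messages_per_minute(
    concurrency=7, batch_size=10, pollers=5, polls_per_minute=6
)
print(estimate)                      # 2100
print(estimate / TOKENS_PER_MINUTE)  # 8.4
```

Under these assumptions the trigger can deliver roughly eight times more messages per minute than the token budget allows.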

As the AWS rep writes, the only way to combat this issue is not to use SQS with lambda directly:

The only way to mitigate this is to disable the SQS triggers on the Lambda function.

2
votes

I don't think SQS is the right product for this kind of problem. What you are looking for is throttling, and an SQS trigger is probably not the right tool for that.

For example, you set the batch size to 10 and the window to 10. That does not mean what you think it means.

You are telling the trigger to collect at most 10 items for at most 10 seconds. But if 10 items are available after 1 second, it will invoke your Lambda right away.
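
To make the "whichever comes first" behaviour concrete, here is a simplified model in Python. It is a sketch, not how the service is implemented: `first_batch` is a hypothetical helper, arrival times are in seconds, the window is assumed to start at the first message, and other limits (such as payload size) are ignored.

```python
def first_batch(arrival_times, batch_size, window_seconds):
    """Return (dispatch_time, message_count) for the first batch.

    The batch is dispatched as soon as `batch_size` messages have arrived
    or `window_seconds` have passed since the first message, whichever
    happens first. Returns None if there are no messages.
    """
    if not arrival_times:
        return None
    arrivals = sorted(arrival_times)
    deadline = arrivals[0] + window_seconds
    in_window = [t for t in arrivals if t <= deadline]
    if len(in_window) >= batch_size:
        # Batch fills up before the window expires.
        return in_window[batch_size - 1], batch_size
    # Window expires with a partial batch.
    return deadline, len(in_window)


# Busy queue: 10 messages within 1 second, so the batch goes out at t=1.
print(first_batch([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], batch_size=10, window_seconds=10))  # (1, 10)

# Quiet queue: only 3 messages in the window, so the batch goes out at t=10.
print(first_batch([0, 4, 8, 15], batch_size=10, window_seconds=10))  # (10, 3)
```

With millions of messages waiting, every batch fills immediately, so the window never slows anything down.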

Looking at your requirements, it looks like you are putting a lot more data into the queue than you can read from it.

Considering this, I would propose that you write the data to DynamoDB first and then have a job, triggered by EventBridge, that runs every minute, picks up exactly 250 items (or however many tokens you have) from DynamoDB, and does the work.

In summary:

  1. Put your items into SQS
  2. Trigger Lambda A from SQS
  3. Lambda A will write it to DynamoDB
  4. Create EventBridge rule to trigger a Lambda B every 60 seconds
  5. Lambda B reads n items from DynamoDB and processes them
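
The core of Lambda B (step 5) could look roughly like the sketch below. Everything here is an assumption for illustration: `run_once`, `fetch_pending`, `call_api`, and `mark_done` are hypothetical names, and the in-memory list stands in for the DynamoDB table so the example runs without AWS access. In a real function the three callables would wrap your DynamoDB reads and deletes and the third-party API call.

```python
TOKENS_PER_MINUTE = 250  # third-party API budget from the question


def run_once(fetch_pending, call_api, mark_done, budget=TOKENS_PER_MINUTE):
    """Process at most `budget` items in one scheduled invocation."""
    # Cap the list ourselves in case the store returns more than requested.
    items = list(fetch_pending(budget))[:budget]
    for item in items:
        call_api(item)   # spends one token
        mark_done(item)  # remove or flag the item so it is not picked up again
    return len(items)


# In-memory stand-in for the table, for illustration only.
pending = [f"product-{i}" for i in range(1000)]
calls = []

processed = run_once(
    fetch_pending=lambda n: pending[:n],
    call_api=calls.append,
    mark_done=pending.remove,
)
print(processed, len(pending))  # 250 750
```

Because each scheduled run handles a fixed number of items, the per-minute rate no longer depends on how many messages are waiting. If a run can take longer than 60 seconds, it may also be worth limiting Lambda B's concurrency so that two runs do not overlap and spend the same minute's tokens twice.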