
I have an AWS Lambda function set up with a trigger from an SQS queue. Currently the queue has about 1.3m messages available. According to CloudWatch, the Lambda function has never exceeded 431 invocations in a given minute. I have read that Lambda supports 1000 concurrent executions at a time, so I'm not sure why it would max out at 431 invocations per minute. My function also runs for only about 5.55s on average, so each of those 1000 concurrent slots should turn over multiple times per minute, therefore giving a much higher invocation rate.

How can I figure out what is going on here and get my Lambda function to process through that SQS queue in a more timely manner?
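To put rough numbers on that reasoning, here is a minimal sketch of the arithmetic. It assumes steady state, that the CloudWatch figure is a per-minute sum of invocations, and that every invocation takes the average duration; the function names are only illustrative.

```python
def max_invocations_per_minute(concurrency: int, avg_duration_s: float) -> float:
    """Upper bound on invocations per minute if every slot stays busy."""
    return concurrency * 60.0 / avg_duration_s


def implied_concurrency(invocations_per_minute: float, avg_duration_s: float) -> float:
    """Average number of executions in flight for an observed invocation rate."""
    return invocations_per_minute * avg_duration_s / 60.0


if __name__ == "__main__":
    # Figures from the question: 1000 slots, 5.55 s average, 431 invocations/min.
    print(round(max_invocations_per_minute(1000, 5.55)))  # about 10811 per minute
    print(round(implied_concurrency(431, 5.55), 1))       # about 39.9 concurrent
```

Under those assumptions, 431 invocations per minute corresponds to only about 40 executions running at once, far below the 1000 limit, so something other than the concurrency limit itself appears to be holding the rate down.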

Just checking -- are you using the brand new "SQS to Lambda" functionality? Are you saying that Lambda eventually gets triggered (but takes some time), or that it never gets triggered? Is that CloudWatch metric a SUM() or an AVG()? - John Rotenstein
@JohnRotenstein Yes. I think eventually it gets triggered. But currently I'm adding items to the queue faster than they are getting triggered with Lambda. I'm not sure if it's sum or average. One other note, at one point when messing around with things today, my average duration for the Lambda function dropped below 500ms, and the number of invocations per minute shot up to like 18k or so per minute. I THINK it was due to a bug in my code that I was testing because I realized it wasn't totally working correctly. But it seems like there is a correlation between duration and invocations. - Charlie Fish
The documentation seems to imply that the process is adaptive. How long has this been running? - Michael - sqlbot
What I actually intended to ask was how long the SQS/Lambda integration had been configured (running was not the best choice of word), since the docs seem to suggest that it is adaptive. How many messages do you typically see "in flight" for the queue in the SQS console or in CloudWatch? - Michael - sqlbot
I was able to confirm that "inflight" as used in that blog post does not mean the same thing as inflight from the SQS Developer Guide. The post may get a refresh to clarify. Scaling should be happening. Note that the 1000 concurrency limit is per region, not per function, so if you have other functions also running, you might be hitting that regional limit and need to ask for an increase. - Michael - sqlbot

1 Answer


The limit of 1000 concurrent executions you mention assumes that you have provisioned enough capacity for Lambda to scale that far.

Take a look at this, particularly the last bit. https://docs.aws.amazon.com/lambda/latest/dg/vpc.html

If your Lambda function accesses a VPC, you must make sure that your VPC has sufficient ENI capacity to support the scale requirements of your Lambda function. You can use the following formula to approximately determine the ENI capacity.

Projected peak concurrent executions * (Memory in GB / 3GB)

Where:

Projected peak concurrent execution – Use the information in Managing Concurrency to determine this value.

Memory – The amount of memory you configured for your Lambda function.

The subnets you specify should have sufficient available IP addresses to match the number of ENIs.

We also recommend that you specify at least one subnet in each Availability Zone in your Lambda function configuration. By specifying subnets in each of the Availability Zones, your Lambda function can run in another Availability Zone if one goes down or runs out of IP addresses.
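As a rough worked example of the formula quoted above, the sketch below estimates the ENI count for a given peak concurrency and memory setting. The memory sizes are hypothetical (the question does not state one), and rounding up to a whole ENI is my own assumption, not something the documentation specifies.

```python
import math


def estimated_eni_capacity(peak_concurrency: int, memory_mb: int) -> int:
    """Projected peak concurrent executions * (memory in GB / 3 GB).

    Rounded up to a whole ENI (an assumption made for this sketch).
    """
    return math.ceil(peak_concurrency * memory_mb / (1024 * 3))


if __name__ == "__main__":
    # Hypothetical memory settings at the full 1000 concurrent executions.
    for memory_mb in (512, 1536):
        print(memory_mb, "MB ->", estimated_eni_capacity(1000, memory_mb), "ENIs")
```

If the subnets attached to the function have fewer free IP addresses than this estimate, the function may be unable to scale out to the concurrency you expect.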

Also read this article which points out many things that might be affecting you: https://read.iopipe.com/5-things-to-know-about-lambda-the-hidden-concerns-of-network-resources-6f863888f656

As a last note, make sure your SQS Lambda trigger has a batchSize of 10 (max available).
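If you would rather set the batch size from code than from the console, something like the following sketch should work. It takes a boto3 Lambda client as a parameter and uses the `list_event_source_mappings` and `update_event_source_mapping` calls; the helper name and the function name `my-function` are placeholders, and pagination of the listing is ignored for brevity.

```python
def set_sqs_batch_size(lambda_client, function_name: str, batch_size: int = 10) -> list:
    """Set BatchSize on every SQS event source mapping of a function.

    Returns the UUIDs of the mappings that were updated.
    Only the first page of mappings is examined (a simplification).
    """
    updated = []
    response = lambda_client.list_event_source_mappings(FunctionName=function_name)
    for mapping in response["EventSourceMappings"]:
        # Skip non-SQS sources such as Kinesis or DynamoDB streams.
        if ":sqs:" not in mapping.get("EventSourceArn", ""):
            continue
        lambda_client.update_event_source_mapping(
            UUID=mapping["UUID"],
            BatchSize=batch_size,
        )
        updated.append(mapping["UUID"])
    return updated


# Usage (requires boto3 and AWS credentials; "my-function" is a placeholder):
#   import boto3
#   set_sqs_batch_size(boto3.client("lambda"), "my-function", batch_size=10)
```

With a batch size of 10, each invocation can receive up to 10 messages, so the handler needs to loop over all of the records in the event rather than reading only the first one.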