How can the AWS Lambda concurrent execution limit be reached?

Question

UPDATE

The original test code below is largely correct, but in Node.js the AWS service clients should be set up a bit differently, as per the SDK link provided by @Michael-sqlbot:

// manager
const AWS = require("aws-sdk")
const https = require('https');
const agent = new https.Agent({
    maxSockets: 498 // workers hit this level; expect plus 1 for the manager instance
});
const lambda = new AWS.Lambda({
    apiVersion: '2015-03-31',
    region: 'us-east-2', // Initial concurrency burst limit = 500
    httpOptions: {   // <--- replace the default of 50 (https) by
        agent: agent // <--- plugging the modified Agent into the service
    }
})
// NOW begin the manager handler code

In planning for a new service, I am doing some preliminary stress testing. After reading about the 1,000 concurrent execution limit per account and the initial burst rate (which in us-east-2 is 500), I was expecting to achieve at least the 500 burst concurrent executions right away. The screenshot below of CloudWatch's Lambda metric shows otherwise. I cannot get past 51 concurrent executions no matter what mix of parameters I try. Here's the test code:

// worker
exports.handler = async (event) => {
    // declare sleep promise
    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

    // return after one second
    let nStart = new Date().getTime()
    await sleep(1000)
    return new Date().getTime() - nStart; // report the exact ms the sleep actually took
};

// manager
const AWS = require("aws-sdk")
exports.handler = async(event) => {
    const invokeWorker = async() => {
        try {
            let lambda = new AWS.Lambda() // NO! DO NOT DO THIS, SEE UPDATE ABOVE
            var params = {
                FunctionName: "worker-function",
                InvocationType: "RequestResponse",
                LogType: "None"
            };
            return await lambda.invoke(params).promise()

        }
        catch (error) {
            console.log(error)
        }
    };

    try {
        let nStart = new Date().getTime()
        let aPromises = []

        // invoke workers
        for (var i = 1; i <= 3000; i++) {
            aPromises.push(invokeWorker())
        }

        // record time to complete spawning
        let nSpawnMs = new Date().getTime() - nStart

        // wait for the workers to ALL return
        let aResponses = await Promise.all(aPromises)

        // sum all the actual sleep times
        const reducer = (accumulator, response) => { return accumulator + parseInt(response.Payload) };
        let nTotalWorkMs = aResponses.reduce(reducer, 0)

        // show me
        let nTotalET = new Date().getTime() - nStart
        return {
            jobsCount: aResponses.length,
            spawnCompletionMs: nSpawnMs,
            spawnCompletionPct: `${Math.floor(nSpawnMs / nTotalET * 10000) / 100}%`,
            totalElapsedMs: nTotalET,
            totalWorkMs: nTotalWorkMs,
            parallelRatio: Math.floor(nTotalET / nTotalWorkMs * 1000) / 1000
        }
    }

    catch (error) {
        console.log(error)
    }
};

Response:
{
  "jobsCount": 3000,
  "spawnCompletionMs": 1879,
  "spawnCompletionPct": "2.91%",
  "totalElapsedMs": 64546,
  "totalWorkMs": 3004205,
  "parallelRatio": 0.021
}

Request ID:
"43f31584-238e-4af9-9c5d-95ccab22ae84"
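
As a rough sanity check on the response above, the following sketch estimates the elapsed time and parallelRatio when only a fixed number of workers can run at once. The helper name `estimate` is hypothetical, and the model assumes every worker takes exactly one second and ignores all network and SDK overhead, so treat the figures as approximations only.

```javascript
// Hypothetical helper: estimate wall-clock time when at most `cap` workers run at once.
// Assumes each worker takes exactly `jobMs` and that there is no other overhead.
function estimate(jobs, cap, jobMs) {
  const waves = Math.ceil(jobs / cap);   // batches needed to drain all jobs
  const elapsedMs = waves * jobMs;       // each batch lasts one worker duration
  const totalWorkMs = jobs * jobMs;      // summed sleep time across all workers
  return { waves, elapsedMs, parallelRatio: elapsedMs / totalWorkMs };
}

// 3,000 one-second jobs with about 50 running at a time
console.log(estimate(3000, 50, 1000));
// the same jobs if the full burst of 500 were available
console.log(estimate(3000, 500, 1000));
```

Under these assumptions a cap of 50 gives 60 waves and roughly 60,000 ms with a ratio of 0.02, which is close to the reported 64,546 ms and 0.021. A cap of 500 would give about 6,000 ms.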

Am I hitting a different limit that I have not mentioned? Is there a flaw in my test code? I was attempting to hit the limit here with 3,000 workers, but there was NO throttling encountered, which I guess is due to the Asynchronous invocation retry behaviour.

Edit: There is no VPC involved on either Lambda; the setting in the select input is "No VPC".

Edit: Showing CloudWatch before and after the fix

What's the configuration of your AWS Lambda function? Is it in VPC? — Dunedan
"which I guess is due to the Asynchronous invocation retry behaviour." You are using InvocationType: "RequestResponse" -- that means synchronous, not asynchronous, even if your handler is an async function. The service isn't retrying. But, if you are running the invoker as a lambda function, too, then unless that invoker function's container has a lot of CPU cycles available (which you can get by bumping up the memory) it likely does not have the resources to generate, sign, and submit enough simultaneous requests to properly perform the test. Maybe run that in EC2. — Michael - sqlbot
D'oh. That's the SDK. "When using the default of https, the SDK takes the maxSockets value from the globalAgent. If the maxSockets value is not defined or is Infinity, the SDK assumes a maxSockets value of 50." — Michael - sqlbot
@Michael-sqlbot ROFL! It's gonna take a bit to recover from that one LOL. So, actually, that is quite handy then, isn't it! Knowing that us-east-2 will only give you 500 on the initial burst, one could set this to 495 and NEVER WORRY about hitting AWS's throttle! Node is caching beyond maxSockets which (may) cause memory concerns with large payloads, so there's that little gotcha, but that's likely minor. As my tests show, there is negligible performance gain above 1024MB. Ok, re-writing this test now..... — Geek Stocks
@Michael-sqlbot - well your Homer Simpson link has me doing Tim Allen power tool noises right now! Check out the updated screenshot. The modified code has smashed the parallelRatio from 0.022 to 0.008! That's fun!! If you do a write up in answer form I can get ya checked. I visit my kid at OSU often but don't quite make it down to "Who Dey" county. The next time I do, I owe you some beers!! Thank you. — Geek Stocks

Michael - sqlbot · Accepted Answer · 2019-02-12T15:21:43

There were a number of potential suspects, particularly due to the fact that you were invoking Lambda from Lambda, but your focus on consistently seeing a concurrency of 50 — a seemingly arbitrary limit (and a suspiciously round number) — reminded me that there's an anti-footgun lurking in the JavaScript SDK:

In Node.js, you can set the maximum number of connections per origin. If maxSockets is set, the low-level HTTP client queues requests and assigns them to sockets as they become available.

Here of course, "origin" means any unique combination of scheme + hostname, which in this case is the service endpoint for Lambda in us-east-2 that the SDK is connecting to in order to call the Invoke method, https://lambda.us-east-2.amazonaws.com.
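
To illustrate what "origin" means here, this small sketch uses Node's built-in URL class. The helper name `originOf` and the request paths are illustrative assumptions, not something taken from the SDK; the point is only that different paths on the same endpoint share one origin, and therefore one socket budget.

```javascript
// Hypothetical helper: reduce a full URL to its origin (scheme + host, plus port if non-default).
function originOf(url) {
  return new URL(url).origin;
}

// Two illustrative request URLs against the same regional Lambda endpoint.
const a = originOf("https://lambda.us-east-2.amazonaws.com/2015-03-31/functions/worker-function/invocations");
const b = originOf("https://lambda.us-east-2.amazonaws.com/2015-03-31/functions/another-function/invocations");

console.log(a);        // https://lambda.us-east-2.amazonaws.com
console.log(a === b);  // true
```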

This lets you set an upper bound on the number of concurrent requests to a given origin at a time. Lowering this value can reduce the number of throttling or timeout errors received. However, it can also increase memory usage because requests are queued until a socket becomes available.

...

When using the default of https, the SDK takes the maxSockets value from the globalAgent. If the maxSockets value is not defined or is Infinity, the SDK assumes a maxSockets value of 50.

https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/node-configuring-maxsockets.html
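
The queueing behaviour described above can be observed without AWS at all. The following is a minimal sketch using only Node's built-in http module against a throwaway local server; the function name `measurePeakConcurrency` and the chosen numbers are assumptions for illustration. It fires many requests through an Agent with a small maxSockets and records how many the server sees in flight at once.

```javascript
const http = require("http");

// Hypothetical helper: send `total` requests through an Agent capped at `maxSockets`
// and resolve with the peak number of requests the local server handled at once.
function measurePeakConcurrency(total, maxSockets, holdMs) {
  return new Promise((resolve, reject) => {
    let inFlight = 0;
    let peak = 0;

    // Local stand-in for a slow service: hold every request open for `holdMs`.
    const server = http.createServer((req, res) => {
      inFlight++;
      peak = Math.max(peak, inFlight);
      setTimeout(() => {
        inFlight--;
        res.end("ok");
      }, holdMs);
    });

    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      const agent = new http.Agent({ maxSockets }); // requests beyond this are queued
      let done = 0;

      for (let i = 0; i < total; i++) {
        http
          .get({ host: "127.0.0.1", port, agent }, (res) => {
            res.resume(); // drain the body so "end" fires
            res.on("end", () => {
              if (++done === total) {
                agent.destroy();
                server.close();
                resolve(peak);
              }
            });
          })
          .on("error", reject);
      }
    });
  });
}

// 20 requests, but only 5 sockets: the server should never see more than 5 at once.
measurePeakConcurrency(20, 5, 50).then((peak) => {
  console.log("peak in-flight requests:", peak);
});
```

The same mechanism, with the SDK's assumed default of 50 sockets, would explain a manager function that issues 3,000 invocations but never has more than about 50 outstanding.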
