5
votes

I’m trying to build an application on AWS that is 100% serverless (minus the database, for now), and what I’m running into is that the database is the bottleneck. My application can scale very well, but my database has a finite number of connections it can accommodate, and at some point my Lambdas will run into that limit. I can do connection pooling outside of the handler in my Lambdas so that there is a database connection per Lambda container instead of per invocation, and while that does increase the number of concurrent invocations before I hit the connection limit, the limit still exists.
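For reference, here is a minimal sketch of that per-container pooling, assuming a Node.js Lambda and the `pg` Postgres client (the env vars and query are illustrative, not part of my actual app):

```typescript
// db-handler.ts — sketch of per-container connection pooling with the "pg" client.
// The pool lives at module scope, so it is created once per Lambda container,
// not once per invocation. Env var names and the query are illustrative.
import { Pool } from "pg";

const pool = new Pool({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
  max: 1, // keep each container's footprint small so high concurrency doesn't exhaust the DB
});

export const handler = async (event: { userId: string }) => {
  // Every invocation running in this container borrows from the same pool.
  const { rows } = await pool.query("SELECT * FROM users WHERE id = $1", [event.userId]);
  return { statusCode: 200, body: JSON.stringify(rows) };
};
```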

I have two questions: 1. Does Aurora Serverless solve this by autoscaling the number of instances to meet the need for more connections? 2. Are there any other solutions to this problem?

Also, to other developers interested in serverless: am I trying to do something that’s not worth doing? I love how easy deployment is with the Serverless Framework, but is it better to just work with microservices on something like Kubernetes instead?

2
Were you able to solve this? – pregmatch
I ended up doing a couple of things to mitigate the issue. First, I used CloudWatch to keep my Lambdas warm, thereby avoiding a bunch of open but unused connections; there’s a great Serverless Framework plugin for that called serverless-plugin-warmup. I also reduced the number of database connections needed by moving my more ephemeral data to Redis on ElastiCache and trying to ensure each Lambda only accesses one datastore. This doesn't solve the problem outright, but it pushes it a little further back. – Matt Clevenger
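A common pattern with warm-up plugins like the one mentioned above is to detect the warming event and return early, so the warm invocation never touches the database. This is only a sketch: the `source` field check is an assumption, and the exact warm-up payload depends on the plugin version and configuration.

```typescript
// handler.ts — sketch of short-circuiting warm-up pings before any database work.
// The "source" field check is an assumption; the exact warm-up payload depends on
// the plugin version and configuration.
export const handler = async (event: { source?: string; userId?: string }) => {
  if (event.source === "serverless-plugin-warmup") {
    // Warm-up invocation: keep the container alive, but do no real work
    // and open no database connections.
    return "warmed";
  }
  // ...normal request handling (database access, etc.) would go here.
  return { statusCode: 200, body: JSON.stringify({ userId: event.userId }) };
};
```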

2 Answers

2
votes

I believe there are two potential solutions to that problem:

The first and simplest option is to take advantage of the "Lambda hot state": the concept where Lambda reuses the execution context for subsequent invocations. As per the AWS suggestion:

Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.

Basically, while the Lambda function is in the hot state, it might/should reuse the opened connection(s).
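As a rough sketch of that suggestion, assuming a Node.js Lambda and the `pg` Postgres client (the env var and query are illustrative):

```typescript
// Sketch of the "check if a connection exists before creating one" advice,
// assuming the "pg" Postgres client; the env var and query are illustrative.
import { Client } from "pg";

let client: Client | undefined; // survives across invocations while the container stays warm

async function getClient(): Promise<Client> {
  if (!client) {
    client = new Client({ connectionString: process.env.DATABASE_URL });
    await client.connect(); // pay the connection cost only on a cold start
  }
  return client;
}

export const handler = async () => {
  const db = await getClient();
  const { rows } = await db.query("SELECT now()");
  return rows[0];
};
```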

The limitations of this approach:

  • you only reuse a connection within a single Lambda function, so if you have 5 Lambda functions being invoked all the time, you will still be using at least 5 connections
  • when you have a spike in Lambda invocations, including parallel executions, this approach becomes less effective, since Lambda will create a new execution context (and thus a new connection) for the majority of requests

The second option would be to use a connection pool: a cache of established database connections maintained so that the connections can be reused when future requests to the database are required.

While the second option provides a more consistent solution, it requires much more infrastructure:

  • you would need to run a separate instance for the pooler, and if you want to do things properly, probably at least two instances and a load balancer (unless you use containers).

While it might seem overwhelming to provision that much additional infrastructure for a connection pooler, it can still be a valid option depending on the scale of the project, your existing infrastructure (maybe you are already using containers), and the cost benefits.
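As an illustration only (the answer doesn't prescribe a specific pooler; PgBouncer is just one possible choice), the Lambda code barely changes with an external pooler: it connects to the pooler's endpoint instead of the database directly. The host, port, and env var names below are assumptions:

```typescript
// Sketch of pointing a Lambda at an external pooler (e.g. PgBouncer) instead of the
// database itself. Host, port, and env var names are illustrative assumptions.
import { Client } from "pg";

export const handler = async () => {
  const client = new Client({
    host: process.env.POOLER_HOST, // the pooler's address, not the database's
    port: 6432,                    // PgBouncer's conventional port; the DB itself stays on 5432
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME,
  });
  await client.connect();  // the pooler hands back one of its pre-established server connections
  const { rows } = await client.query("SELECT 1");
  await client.end();      // frees the client-side slot; the pooled server connection is kept open
  return rows;
};
```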

0
votes

AWS best practices recommend taking advantage of the hot start. You can read more about it here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.BestPracticesWithDynamoDB.html