
We are building an application that uses heavy backend tasks (task queues), and in each task we do I/O against Google Cloud SQL.

GAE has a limit of 12 concurrent connections (I'm not sure whether this is the issue; I saw it at https://stackoverflow.com/a/26155819/687692):

""Each App Engine instance running in a Standard environment or using Standard-compatible APIs cannot have more than 12 concurrent connections to a Google Cloud SQL instance." - https://cloud.google.com/sql/faq"

Most of my backend tasks (100-500 tasks per second) are failing because of this issue.

[Screenshot: failing tasks]

Also, I checked the active connections for the last 4 days: I don't see the connection count going above 12 at any point.

[Screenshot: last 4 days - GCS]

So, what approach should I take to fix this? Connection pooling (and if so, how do I do it on GAE with Google Cloud SQL)? Or some other fix?

Any help or guidance is very much appreciated. Let me know if anyone needs more information.

Thanks,

The error you mention does usually indicate that the 12 connection limit has been reached. Note that the limit is 12 concurrent connections per App Engine instance, not for the whole application. Do you know how many task queue requests each App Engine instance in your application is handling? Would setting a lower max_concurrent_requests work for you, as per the linked post (see the yaml sketch after these comments)? Note that a lower max_concurrent_requests value may cause more App Engine instances to be spun up to handle requests. - Vadim
max_concurrent_requests is for front-end instances. My task queues generally run 100-500 tasks in one second. (I hope these are not huge numbers for big projects.) - Love Sharma
max_concurrent_requests applies to any module using automatic scaling. Can you describe your setup a bit more? Are you using backends (cloud.google.com/appengine/docs/python/backends) or a separate module for your task handlers? How many tasks is each instance handling per second (not the overall number of tasks)? - Vadim
I am using the Task Queue API (cloud.google.com/appengine/articles/deferred), with around 10 queues for different types of tasks, each handled differently. Each queue has a limit of 500/s (bucket size 500), which I guess is the max. For example:

```
- name: reminder
  rate: 500/s
  bucket_size: 500
  retry_parameters:
    task_retry_limit: 2
```

I have a cron job which adds tasks to these queues, almost 500 tasks. - Love Sharma
What kind of scaling are you using for the module that is handling the tasks? I wrote a small load test app that schedules 10,000 tasks that perform some database work (and hold the connection open for a short time to increase contention). The test is able to finish with 0 errors. I'm interested to know if there's anything different in your setup. - Vadim
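For reference, here is a minimal sketch of what capping max_concurrent_requests on an automatic-scaling module could look like, as discussed in the comments above. The module name, the handler mapping, and the value 8 are illustrative assumptions, not the asker's actual configuration:

```
# module.yaml (illustrative values, python27-era module)
module: task-workers
runtime: python27
api_version: 1
threadsafe: true

automatic_scaling:
  # Keep in-flight requests per instance safely below the
  # 12-connection Cloud SQL limit.
  max_concurrent_requests: 8

handlers:
# Route deferred task execution to this module.
- url: /_ah/queue/deferred
  script: google.appengine.ext.deferred.deferred.application
  login: admin
```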

1 Answer


It's not likely that you would exceed the 12 connection limit with the standard Python App Engine scaling settings if the connections are handled properly.

To demonstrate, I have created a small application that schedules many tasks, with each task acquiring a database connection and doing some work. I am able to run this test without hitting connection issues.

One thing worth checking is that you are not leaking connections (i.e. failing to close the connection on some code paths or when exceptions happen).

For MySQLdb, you can guarantee you are not leaking connections by using closing from contextlib:

from contextlib import closing

import MySQLdb

def getDbConnection():
    # Connect to Cloud SQL via the App Engine unix socket.
    return MySQLdb.connect(unix_socket='/cloudsql/instance_name', db='db', user='user', charset='utf8')

with closing(getDbConnection()) as db:
    # do stuff; the connection is guaranteed to be closed when the block
    # exits, even if an exception is raised
    pass
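
Putting it together, here is a rough sketch of the shape of the load test described above. The function names, the use of deferred.defer, and the placeholder query are illustrative assumptions, not the exact test code:

from contextlib import closing

import MySQLdb
from google.appengine.ext import deferred

def getDbConnection():
    return MySQLdb.connect(unix_socket='/cloudsql/instance_name', db='db', user='user', charset='utf8')

def doDatabaseWork():
    # Each task opens its own connection and releases it promptly, so an
    # instance never holds more connections than the requests it is
    # concurrently serving.
    with closing(getDbConnection()) as db:
        with closing(db.cursor()) as cursor:
            cursor.execute('SELECT 1')  # stand-in for the real work
            cursor.fetchone()

def scheduleLoadTest(num_tasks=10000):
    # Fan the work out as individual tasks via the deferred library.
    for _ in xrange(num_tasks):
        deferred.defer(doDatabaseWork)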