I'm deploying a sample FastAPI app to the cloud on the Google App Engine standard environment. The app is served with gunicorn like this:

gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:80

This command spawns 4 worker processes of my app.

I've read that in FastAPI you can create either sync or async endpoints. If an endpoint is async, all requests run on a single thread, on the event loop. If the endpoint is sync, FastAPI runs the function on another thread to prevent it from blocking the server.

I have sync, blocking endpoints, so FastAPI should run them on threads, but I also have gunicorn spawning worker processes.

Given that Python executes only one thread at a time per process, and that App Engine standard instances are also CPU-limited when it comes to running multiple processes, I'm confused about the best configuration for a FastAPI application in the cloud.

Should I let gunicorn or FastAPI handle the concurrency?

1 Answer

The number of workers you specify should match the instance class of your App Engine app. The entrypoint best practices recommend a worker count for each instance class, so check that 4 workers suits the class you have configured; more workers than the instance class can support will compete for its limited memory and CPU. Here's an example of an App Engine entrypoint that uses 4 gunicorn workers to serve an app: entrypoint: gunicorn -b :8080 -w 4 main:app. This example comes from the entrypoint best practices.
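
If the command line gets unwieldy, the same settings can live in a gunicorn config file. The following is a minimal sketch, not something taken from the App Engine docs: the file name gunicorn.conf.py and the fallback port 8080 are my assumptions, and the worker class is the one from your question. It assumes App Engine supplies the port through the PORT environment variable.

```python
# gunicorn.conf.py (assumed file name; gunicorn config files are plain Python)
import os

# Number of worker processes; tune this to your instance class.
workers = 4

# ASGI worker from the question, needed because FastAPI is an ASGI app.
worker_class = "uvicorn.workers.UvicornWorker"

# Listen on the port from the PORT environment variable,
# falling back to 8080 (an assumed default) for local runs.
bind = f"0.0.0.0:{os.environ.get('PORT', '8080')}"
```

You would then start the server with: gunicorn -c gunicorn.conf.py main:app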

Note that gunicorn uses sync workers by default. That worker class is compatible with all WSGI web applications, but each worker can handle only one request at a time. FastAPI is an ASGI app, so you still need the uvicorn.workers.UvicornWorker class from your command.
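
By contrast, a worker that runs an event loop can overlap several requests in one process, even when the handlers block, as long as the blocking calls are pushed to threads. Here is a rough stdlib-only sketch of that idea. It is not FastAPI's actual code: asyncio.to_thread stands in for the thread pool FastAPI uses for sync endpoints, and blocking_handler, serve, and run_demo are hypothetical names made up for the illustration.

```python
import asyncio
import time


def blocking_handler(request_id: int) -> int:
    # Stand-in for a sync (`def`) endpoint that blocks, e.g. on I/O.
    time.sleep(0.2)
    return request_id


async def serve(n_requests: int) -> float:
    # Offload each blocking call to a worker thread so the event loop
    # stays free; the calls overlap instead of running back to back.
    start = time.perf_counter()
    results = await asyncio.gather(
        *(asyncio.to_thread(blocking_handler, i) for i in range(n_requests))
    )
    assert results == list(range(n_requests))
    return time.perf_counter() - start


def run_demo(n_requests: int = 4) -> float:
    # Returns the wall-clock time needed to serve all requests.
    return asyncio.run(serve(n_requests))


if __name__ == "__main__":
    print(f"4 overlapping requests took {run_demo():.2f}s")
```

Because time.sleep releases the GIL, the four calls finish in roughly the time of one. CPU-bound handlers would not overlap this way inside a single process, which is where extra gunicorn workers help.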

Lastly, if you're using the App Engine flexible environment, check the recommended gunicorn configurations for further guidance.