4
votes

The question is about Play framework specifically although concept is generic.

Quoting from:

https://www.playframework.com/documentation/2.6.18/ScalaAsync

The web client will be blocked while waiting for the response, but nothing will be blocked on the server, and server resources can be used to serve other clients.

Using a Future is only half of the picture though! If you are calling out to a blocking API such as JDBC, then you still will need to have your ExecutionStage run with a different executor, to move it off Play’s rendering thread pool.
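To make the quoted advice concrete, here is a minimal plain-Scala sketch. The `jdbcEc` pool and `blockingQuery` are hypothetical stand-ins for a Play custom dispatcher and a real JDBC call:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical dedicated pool for blocking JDBC-style calls; in Play this
// would be a custom dispatcher looked up from configuration.
val jdbcEc = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

// Stand-in for a blocking JDBC call.
def blockingQuery(): String = { Thread.sleep(100); "row" }

// The caller (Play's rendering thread) just builds the Future and returns;
// the blocking work runs on jdbcEc, not on the caller's thread.
val result: Future[String] = Future(blockingQuery())(jdbcEc)

println(Await.result(result, 1.second)) // prints "row"
jdbcEc.shutdown()
```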

I understand that the original web application thread will be freed, but another thread is still needed to actually perform the CPU-intensive work and compute the result, which is then propagated to the client (which is blocked in the meantime).

How is this better than synchronously performing the execution in Play's action code? We would have to increase the number of threads (since the blocking requests consume threads), but the total number of active threads on the server would remain the same.

Can someone also shed light on how Play tracks the blocked client connection and returns the response in the non-blocking action scenario?

Hi Asad, I've answered your main question. It looks like you have another question at the end there. If you still want that answered, can you create another question for it, so that we have just one question per question? – Brian McCutchon

1 Answer

6
votes

Using different thread pools for rendering and long-running operations is desirable because that way the long-running operations can use all of the threads in their pool without blocking rendering.

Imagine this situation:

  1. 10 clients make requests for resources that require long-running operations.
  2. Then a client tries to access a resource that doesn't.

Here are two ways that this could be handled:

  1. You have a pool with 10 threads used for everything. These fill up doing your long-running operations, and the other client — who has a simpler request! — has to wait for one of the long-running calls to finish.
  2. You have two thread pools, one with 5 threads used for rendering and another with 5 threads used for long-running operations. The rendering threads quickly give the long-running operations work to the other pool, freeing them to respond to the eleventh client's request.
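The second scenario can be sketched with plain Scala executors (the pool sizes, names, and timings are illustrative; in Play you would configure a custom dispatcher rather than build pools by hand):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Two hypothetical pools of 5 threads each, as in the scenario above.
val blockingEc  = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(5))
val renderingEc = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(5))

// Ten "slow" requests saturate the blocking pool...
val slow = (1 to 10).map(i => Future { Thread.sleep(500); s"slow-$i" }(blockingEc))

// ...but the eleventh, quick request still finds a free thread on the
// rendering pool and completes immediately.
val quickResult = Await.result(Future("quick")(renderingEc), 1.second)
println(quickResult) // prints "quick"

slow.foreach(f => Await.result(f, 5.seconds))
blockingEc.shutdown(); renderingEc.shutdown()
```

With a single shared 10-thread pool, the quick request would instead sit in the queue behind the slow ones.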

The second situation is definitely better, but I would like to point out another reason for having multiple thread pools: sometimes different operations require different kinds of system resources. For example, rendering might be CPU-bound, while database calls might be mostly network-bound, or CPU-bound but done on a different machine (the database server). If you use the same thread pool for both, the threads might get busy waiting for network calls to finish while the CPU sits mostly idle, even if you have several CPU-bound tasks queued. That would be an inefficient use of resources, so you should have different thread pools for different kinds of tasks.