The way to limit the number of threads is to not spawn them directly.
Instead use a thread pool with a fixed upper bound on the number of threads.
The modern way to do this is to use the ExecutorService API and create the service with either Executors.newFixedThreadPool(...) or, for finer control, one of the ThreadPoolExecutor constructor overloads.
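As a minimal sketch of the fixed-pool approach: the class name FixedPoolExample, the method distinctThreadsUsed, and the numbers 5 and 20 are made up for illustration and are not taken from the question's code. It submits more tasks than there are threads and counts how many distinct threads ran them, which should never exceed the pool's bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the names here are illustrative only.
class FixedPoolExample {

    // Runs taskCount short tasks on a pool capped at maxThreads threads and
    // returns the number of distinct threads that actually executed them.
    static int distinctThreadsUsed(int maxThreads, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                // Tasks beyond maxThreads wait in the pool's queue instead of
                // causing new threads to be created.
                futures.add(pool.submit(() -> {
                    threadNames.add(Thread.currentThread().getName());
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for each task to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown(); // let the worker threads exit once idle
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        // Prints a number between 1 and 5, depending on scheduling.
        System.out.println(distinctThreadsUsed(5, 20));
    }
}
```

Note that Executors.newFixedThreadPool(...) uses an unbounded queue, so the thread count is capped but the backlog of waiting tasks is not.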
In this case, do we consider number of threads as 20 or 5 for dealing with the property server.tomcat.max-threads?
Threads that are created by an application or an application thread pool while processing a request do not count as "worker threads" for the purposes of that Tomcat configuration property.
It is up to the application or its thread pool to manage any threads it creates and ensure that:
- the number of these threads does not get too large,
- they don't consume too many resources (CPU, memory, etc.), and
- they don't get "orphaned" and end up wasting resources on a task that is no longer needed; e.g. because the original client request timed out.
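One way an application might cover those three points is sketched below, under stated assumptions: BoundedWork, newBoundedPool, and callWithTimeout are hypothetical names, and the sizes and timeouts are placeholders rather than recommendations. The pool caps both the thread count and the queue length, and a task that overruns its deadline is cancelled so that it does not keep running after the caller has given up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class; names and policies are illustrative only.
class BoundedWork {

    // A pool with a fixed number of threads AND a bounded queue. When both
    // are full, submit(...) throws RejectedExecutionException (AbortPolicy)
    // instead of letting the backlog grow without limit.
    static ExecutorService newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Runs the task on the pool and waits at most timeoutMillis for it.
    // On timeout the task is cancelled with an interrupt, so a task that
    // responds to interruption stops instead of being "orphaned".
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> task,
                                 long timeoutMillis, T fallback) {
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker if it is still running
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the task itself threw an exception
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return fallback;
        }
    }
}
```

Cancellation via interrupt only helps if the task cooperates, for example by blocking in interruptible calls or checking Thread.currentThread().isInterrupted(); code that ignores interrupts will keep running until it finishes.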
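Beware that this kind of thing can easily turn into a "denial of service" problem: if every incoming request can start extra threads or tasks without limit, a burst of requests can exhaust the server's CPU or memory.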