I've found some strange behavior with JPA in Play 2. In some cases we get the error
"Timed out waiting for a free available connection."
at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(DefaultConnectionStrategy.java:88) ~[bonecp.jar:na]
The source for that line is available at: https://github.com/wwadge/bonecp/blob/master/bonecp/src/main/java/com/jolbox/bonecp/DefaultConnectionStrategy.java#L88
I did some research and found the following:
- Play 2 uses the actor model via Akka.
- Play 2 uses BoneCP for its database connection pool.
To process a @Transactional request, Play 2 uses at least two separate actors:
The first actor gets a connection from the pool and processes the call.
Relevant code:
- https://github.com/playframework/playframework/blob/master/framework/src/play-java-jpa/src/main/java/play/db/jpa/TransactionalAction.java
- https://github.com/playframework/playframework/blob/master/framework/src/play-java-jpa/src/main/java/play/db/jpa/JPA.java#L180
- https://github.com/playframework/playframework/blob/master/framework/src/play/src/main/scala/play/core/j/JavaAction.scala#L61
The second actor commits or rolls back the transaction; the connection is released only at that moment.
Relevant code: see the JPA.java and JavaAction.scala links above.
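For context, a @Transactional Java action of the kind this question is about might look roughly like this (the controller name, route, table, and query parameter are hypothetical); the key point is that a pooled connection is held for the entire action body:

package controllers;

import play.db.jpa.JPA;
import play.db.jpa.Transactional;
import play.mvc.Controller;
import play.mvc.Result;

public class TestController extends Controller {

    // @Transactional wraps the action via TransactionalAction / JPA.withTransaction():
    // a connection is taken from the pool before the body runs and is released
    // only when the surrounding transaction commits or rolls back.
    @Transactional
    public static Result create() {
        String name = request().getQueryString("name");
        // a single small insert, similar to the ~7 ms workload described below
        JPA.em()
           .createNativeQuery("insert into item (name) values (?1)")
           .setParameter(1, name)
           .executeUpdate();
        return ok("created");
    }
}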
Both actors are executed on the same Akka executor and thread pool; see http://www.playframework.com/documentation/2.2.x/ThreadPools
- By default there are 24 threads.
- By default 30 connections are available in the BoneCP pool; see the docs at http://www.playframework.com/documentation/2.2.x/SettingsJDBC and the code at https://github.com/playframework/playframework/blob/master/framework/src/play-jdbc/src/main/scala/play/api/db/DB.scala#L372
Let's assume we have a high-load application that must process many concurrent requests. In my tests, the default configuration starts to fail at 50+ concurrent requests. You can reproduce the problem with only 3 concurrent requests using a non-default configuration like this:
play {
  akka {
    akka.loggers = ["akka.event.Logging$DefaultLogger", "akka.event.slf4j.Slf4jLogger"]
    loglevel = DEBUG
    actor {
      default-dispatcher = {
        fork-join-executor {
          parallelism-factor = 1.0
          parallelism-min = 1
          parallelism-max = 1
        }
      }
    }
  }
}
...
db.default.minConnectionsPerPartition=2
db.default.maxConnectionsPerPartition=2
...
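Any HTTP client that can issue enough parallel requests can drive the test. Below is a minimal stand-alone sketch in Java; the endpoint path, port, and request count are assumptions matching the configuration above:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LoadTest {

    public static void main(String[] args) throws Exception {
        // Assumptions: the app runs locally and exposes a @Transactional endpoint
        // at /test; 3 concurrent requests are enough to exhaust the
        // 2-connection / 1-thread configuration shown above.
        final int concurrentRequests = 3;
        final URL url = new URL("http://localhost:9000/test?name=x");

        ExecutorService clients = Executors.newFixedThreadPool(concurrentRequests);
        final CountDownLatch start = new CountDownLatch(1);

        for (int i = 0; i < concurrentRequests; i++) {
            clients.submit(new Runnable() {
                public void run() {
                    try {
                        start.await(); // release all requests at the same moment
                        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                        System.out.println("HTTP " + conn.getResponseCode());
                        conn.disconnect();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }
        start.countDown();
        clients.shutdown();
    }
}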
Complexity and processing time do not matter for this test; mine is about 7 ms (parsing params + one insert into the database). You only need to send more concurrent requests than the connection pool size. Here is what I think is happening:
- Many actors have captured threads and connections to process requests.
- More actors try to get a connection from the pool, but at that moment no free connections are available (the other actors hold them all). The pool is based on a BlockingQueue; code: https://github.com/wwadge/bonecp/blob/master/bonecp/src/main/java/com/jolbox/bonecp/DefaultConnectionStrategy.java#L82 So the current thread blocks at that line until the timeout expires (one second by default); a simplified sketch of this pattern follows the list.
- While threads are blocked by these waiting actors, other actors cannot run on those threads.
- So fewer connections are returned to the pool, because fewer actors can run on the remaining threads.
- At some point all threads are blocked, and no thread is left to execute the actors that would return connections to the pool.
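To make that blocking step concrete, the behavior at the linked line boils down to a timed poll on a BlockingQueue. This is a simplified stand-alone sketch of the pattern, not the actual BoneCP code:

import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class PoolSketch {
    private final BlockingQueue<Connection> freeConnections;
    private final long connectionTimeoutMs; // BoneCP's default is about one second

    PoolSketch(BlockingQueue<Connection> freeConnections, long connectionTimeoutMs) {
        this.freeConnections = freeConnections;
        this.connectionTimeoutMs = connectionTimeoutMs;
    }

    Connection getConnection() throws SQLException {
        try {
            // The calling thread parks here until a connection is returned to the
            // queue or the timeout expires. While parked, the thread can do no other
            // work, including running the actor that would release a connection.
            Connection result = freeConnections.poll(connectionTimeoutMs, TimeUnit.MILLISECONDS);
            if (result == null) {
                throw new SQLException("Timed out waiting for a free available connection.");
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new SQLException("Interrupted while waiting for a connection.", e);
        }
    }
}

With a one-thread dispatcher, a single actor parked inside this poll is enough to starve every other actor, including the ones that would return connections to the queue.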
That looks like a deadlock to me.
Can someone give me advice on how to avoid this problem?