Why does Spring Batch use 1 database connection for each thread?

Stack:

  • Java 8
  • Spring Boot 1.5
  • Spring Batch 3.0.7
  • HikariCP 2.7.6

DataSource config:

  • batchdb (postgres)
  • readdb (oracle)
  • writedb (postgres)

Each datasource uses HikariCP with its default maximum pool size of 10 connections.

Spring Batch config: ThreadExecutor-1:

core-pool-size: 10
max-pool-size: 10
throttle-limit: 10

Job-1 Config / ThreadPoolTaskExecutor: (pool sizes and throttle limit set via application.yml)

@Bean
public Step job1Step() {
    return stepBuilderFactory.get("job1Step")
            .<ReadModel, WriteModel>chunk(chunkSize)
            .reader(itemReader())
            .processor(compositeProcessor())
            .writer(itemWriter())
            .faultTolerant()
            .taskExecutor(job1TaskExecutor())
            .throttleLimit(throttleLimit)
            .build();
}

@Bean
public ThreadPoolTaskExecutor job1TaskExecutor() {
    ThreadPoolTaskExecutor pool = new ThreadPoolTaskExecutor();
    pool.setCorePoolSize(poolSize);
    pool.setMaxPoolSize(maxPoolSize);
    pool.setWaitForTasksToCompleteOnShutdown(false);
    return pool;
}

@Bean
@StepScope
public Job1ItemReader job1ItemReader() {
    return new Job1ItemReader(readdb, pageSize);
}

Abbreviated Code for Job1-ItemReader

public class Job1ItemReader extends JdbcPagingItemReader<ReadModel> {
...
}

ThreadExecutor-2:

core-pool-size: 5
max-pool-size: 5
throttle-limit: 5

Job-2 Config / ThreadPoolTaskExecutor:

@Bean
public Step job2Step() throws Exception {
    return stepBuilderFactory.get("job2Step")
            .<ReadModel2, WriteModel2>chunk(chunkSize)
            .reader(job2ItemReader())
            .processor(job2CompositeProcessor())
            .writer(job2ItemWriter())
            .faultTolerant()
            .taskExecutor(job2TaskExecutor())
            .throttleLimit(throttleLimit)
            .build();
}

@Bean
public ThreadPoolTaskExecutor job2TaskExecutor() {
    ThreadPoolTaskExecutor pool = new ThreadPoolTaskExecutor();
    pool.setCorePoolSize(corePoolSize);
    pool.setMaxPoolSize(maxPoolSize);
    pool.setQueueCapacity(queueCapacity);
    pool.setWaitForTasksToCompleteOnShutdown(false);
    return pool;
}

@Bean
@StepScope
public Job2ItemReader job2ItemReader() {
    return new Job2ItemReader(readdb, pageSize);
}

Abbreviated Code for Job2-ItemReader

public class Job2ItemReader extends JdbcPagingItemReader<ReadModel2> {
...
}

  • There are 2 jobs
  • Job-1 is long-running (multiple days)
  • Job-2 usually completes in an hour or two, and runs on a schedule every day
  • Both jobs are in the same 'application', running on the same JVM
  • Each job has its own ThreadPoolTaskExecutor defined

When Job-1 is running and Job-2 starts, Job-2 is unable to get a connection to readdb. The batch reader of Job-2 throws the following error:

Caused by: org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:339)
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:366)
at org.springframework.batch.support.DatabaseType.fromMetaData(DatabaseType.java:97)
at org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean.getObject(SqlPagingQueryProviderFactoryBean.java:158)
... 30 common frames omitted
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:326)
... 33 common frames omitted
Caused by: java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:666)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:182)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:147)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:123)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)

(edited to protect the innocent)

Could it be that your code is simply not returning connections back to the pool? You haven't shared any code, so it's hard to tell. - Mick Mnemonic
I rely on Spring Framework to manage the database connections; I don't explicitly open or close any db connections in my code. I'll try to add a few more details about the framework portions that are managing the db connections. - shawnjohnson
Can you post how you are defining the taskExecutors? - The Guest

1 Answer

Spring Batch uses one database connection per thread (and can use more in certain situations) because of transactions. Spring transactions are bound to a thread, and just about everything within Spring Batch happens within a transaction. So when you have a single job running on a single thread, you'll use only a couple of connections at most. With a multithreaded step, however, expect at least one connection per thread for transaction handling.
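
To make the arithmetic concrete, here is a toy model in plain Java. It is not Spring Batch or HikariCP code. It assumes that each worker thread holds exactly one connection for as long as its transaction is open, and it uses a Semaphore as a stand-in for the connection pool. The class name, method name, and the 200 ms timeout are all made up for illustration. Under those assumptions, 10 busy Job-1 threads can occupy every connection in a pool of 10, so a late request from Job-2 times out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionSketch {

    /**
     * Returns true if a late-arriving caller can obtain a "connection" while
     * busyThreads workers (busyThreads >= 1) each hold one for the duration
     * of their simulated transaction.
     */
    public static boolean secondJobGetsConnection(int poolSize, int busyThreads, long timeoutMs)
            throws InterruptedException {
        Semaphore pool = new Semaphore(poolSize); // stand-in for the connection pool
        // At most poolSize workers can hold a connection at the same time.
        CountDownLatch allHolding = new CountDownLatch(Math.min(poolSize, busyThreads));
        CountDownLatch release = new CountDownLatch(1);
        ExecutorService job1 = Executors.newFixedThreadPool(busyThreads);

        for (int i = 0; i < busyThreads; i++) {
            job1.submit(() -> {
                try {
                    pool.acquire(); // transaction begins: connection is bound to this thread
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // never got a connection, so nothing to release
                }
                try {
                    allHolding.countDown();
                    release.await(); // simulated chunk processing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.release(); // transaction ends: connection goes back to the pool
                }
            });
        }

        allHolding.await(); // wait until the first job holds its connections
        boolean acquired = pool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        if (acquired) {
            pool.release();
        }

        release.countDown(); // let the workers finish
        job1.shutdownNow();  // interrupt any worker still waiting for a connection
        job1.awaitTermination(5, TimeUnit.SECONDS);
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // Pool of 10 shared with 10 busy threads: the second job times out.
        System.out.println(secondJobGetsConnection(10, 10, 200));
        // Pool of 15 shared with 10 busy threads: the second job gets a connection.
        System.out.println(secondJobGetsConnection(15, 10, 200));
    }
}
```

In this model the first call prints false and the second prints true. The practical reading is that a pool shared by several multithreaded jobs probably needs at least as many connections as the combined thread counts of the jobs that can run at the same time (10 + 5 in the question), or each job needs its own pool.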