I have a Spring Cloud Data Flow server deployed on Pivotal Cloud Foundry. The server runs a pipeline of three Spring Batch tasks, encapsulated in a composed task.
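For reference, the composed task is defined with the standard SCDF composed-task DSL, along these lines (the task names here are placeholders; the real definition chains my three batch tasks):

task create my-pipeline --definition "task1 && task2 && task3"
task launch my-pipeline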

When I launch this composed task, the composed-task-runner starts the first batch job execution. This first batch is connected to two different datasources: a shared metadata datasource for the Spring metadata schemas (SCDF, SCT and SB) and a business datasource for my business data. Both databases are MySQL. The first task itself completes fine; however, when the composed-task-runner then tries to retrieve the task execution status from the task repository (the metadata datasource), it throws the following exception and stops the whole pipeline:

org.springframework.dao.DeadlockLoserDataAccessException: 
PreparedStatementCallback; 
SQL [SELECT TASK_EXECUTION_ID, START_TIME, END_TIME, TASK_NAME, EXIT_CODE, EXIT_MESSAGE, ERROR_MESSAGE, LAST_UPDATED, EXTERNAL_EXECUTION_ID, PARENT_EXECUTION_ID from TASK_EXECUTION where TASK_EXECUTION_ID = ?]; 
(conn:56675) Deadlock found when trying to get lock; 
try restarting transaction 
Query is: SELECT TASK_EXECUTION_ID, START_TIME, END_TIME, TASK_NAME, EXIT_CODE, EXIT_MESSAGE, ERROR_MESSAGE, LAST_UPDATED, EXTERNAL_EXECUTION_ID, PARENT_EXECUTION_ID from TASK_EXECUTION where TASK_EXECUTION_ID = ?, parameters [2]; 
nested exception is 
java.sql.SQLTransactionRollbackException: (conn:56675) Deadlock found when 
trying to get lock; try restarting transaction 
Query is: SELECT TASK_EXECUTION_ID, START_TIME, END_TIME, TASK_NAME,EXIT_CODE, EXIT_MESSAGE, ERROR_MESSAGE, LAST_UPDATED, EXTERNAL_EXECUTION_ID, PARENT_EXECUTION_ID from TASK_EXECUTION where TASK_EXECUTION_ID = ?, parameters [2] 
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:263) 
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) 
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:649) 
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:684) 
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:716) 
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:726) 
at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:800) 
at org.springframework.cloud.task.repository.dao.JdbcTaskExecutionDao.getTaskExecution(JdbcTaskExecutionDao.java:262) 
at org.springframework.cloud.task.repository.support.SimpleTaskExplorer.getTaskExecution(SimpleTaskExplorer.java:52) 
at org.springframework.cloud.task.app.composedtaskrunner.TaskLauncherTasklet.waitForTaskToComplete(TaskLauncherTasklet.java:146)
at org.springframework.cloud.task.app.composedtaskrunner.TaskLauncherTasklet.execute(TaskLauncherTasklet.java:123)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:406) 
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:330) 
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) 
at org.springframework.batch.core.s.

Here is the code that configures the multiple-datasource access in my Spring Cloud Task / Spring Batch application.

The BatchJobConfiguration class:

@Profile("!test")
@Configuration
@EnableBatchProcessing
public class BatchJobConfiguration {

@Autowired
private JobBuilderFactory jobBuilderFactory; 

[...]

@Bean
public Step step01() {
    return stepChargementFoliosBuilder().buildStepChargement(); 
}

@Bean
public Step step02() {
    return stepChargementPretsBuilder().buildStepChargement(); 
}

@Bean
public Step step03() {      
    return stepChargementGarantiesBuilder().buildStepChargement(); 
}

@Bean
public Job job() {
    return jobBuilderFactory.get("Spring Batch Job: chargement_donnees_SEM")
        .incrementer(new JobParametersIncrementer() {

            @Override
            public JobParameters getNext(JobParameters parameters) {
                return new JobParametersBuilder().addLong("time", System.currentTimeMillis()).toJobParameters();
            }
        })
        .flow(step01())
        .on("COMPLETED").to(step02())
        .on("COMPLETED").to(step03())
        .end()
        .build(); 
}

@Primary
@Bean
public BatchConfigurer batchConfigurer(@Qualifier(JPAConfiguration.METADATA_DATASOURCE) DataSource datasource) {
    return new DefaultBatchConfigurer(datasource);
}

Task configuration class:

@Profile("!test")
@Configuration
@EnableTask
public class TaskConfiguration {

@Bean
public TaskRepositoryInitializer taskRepositoryInitializer(@Qualifier(JPAConfiguration.METADATA_DATASOURCE) DataSource datasource) {
    TaskRepositoryInitializer initializer = new TaskRepositoryInitializer(); 
    initializer.setDataSource(datasource);

    return initializer; 
}

@Bean
public TaskConfigurer taskConfigurer(@Qualifier(JPAConfiguration.METADATA_DATASOURCE) DataSource datasource) {
    return new DefaultTaskConfigurer(datasource); 
} 

Finally, here is the JPAConfiguration class:

@Profile("!test")
@Configuration
@EnableTransactionManagement
@EnableJpaRepositories (
basePackages = "com.desjardins.parcourshabitation.chargerprets.repository", 
    entityManagerFactoryRef = JPAConfiguration.BUSINESS_ENTITYMANAGER, 
    transactionManagerRef = JPAConfiguration.BUSINESS_TRANSACTION_MANAGER 
)
public class JPAConfiguration {

public static final String METADATA_DATASOURCE = "metadataDatasource";
public static final String BUSINESS_DATASOURCE = "businessDatasource"; 
public static final String BUSINESS_ENTITYMANAGER = "businessEntityManager"; 
public static final String BUSINESS_TRANSACTION_MANAGER = "businessTransactionManager"; 

@Primary
@Bean(name=METADATA_DATASOURCE)
public DataSource scdfDatasource() {
    return new DatasourceBuilder("scdf-mysql").buildDatasource(); 
}

@Bean(name=BUSINESS_DATASOURCE)
public DataSource pretsDatasource() {
    return new DatasourceBuilder("sem-mysql").buildDatasource(); 
}

@Bean(name=BUSINESS_ENTITYMANAGER)
public LocalContainerEntityManagerFactoryBean businessEntityManager(EntityManagerFactoryBuilder builder, @Qualifier(BUSINESS_DATASOURCE) DataSource dataSource) {

    return builder
        .dataSource(dataSource)
        .packages("com.desjardins.parcourshabitation.chargerprets.domaine")
        .build(); 
}

@Bean(name = BUSINESS_TRANSACTION_MANAGER)
public PlatformTransactionManager businessTransactionManager(@Qualifier(BUSINESS_ENTITYMANAGER) EntityManagerFactory entityManagerFactory) {
    return new JpaTransactionManager(entityManagerFactory);
}
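For context, DatasourceBuilder is a small in-house helper that resolves a PCF-bound service instance by name. A minimal sketch of what it might do, assuming Spring Cloud Connectors (spring-cloud-spring-service-connector) is on the classpath; the actual implementation is not shown in this question:

import javax.sql.DataSource;

import org.springframework.cloud.Cloud;
import org.springframework.cloud.CloudFactory;

// Hypothetical sketch of the DatasourceBuilder used above: it looks up a
// PCF-bound MySQL service instance (e.g. "scdf-mysql") and builds a
// DataSource from its credentials.
public class DatasourceBuilder {

    private final String serviceName;

    public DatasourceBuilder(String serviceName) {
        this.serviceName = serviceName;
    }

    public DataSource buildDatasource() {
        Cloud cloud = new CloudFactory().getCloud();
        // null config: use the connector's default connection-pool settings
        return cloud.getServiceConnector(serviceName, DataSource.class, null);
    }
}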

Versions used:

  • Composed-task-runner: 1.0.0.RELEASE
  • Spring-cloud-task: 1.2.2.RELEASE

I have tried launching the composed task with different values for the interval-time-between-checks property, but this was not conclusive.
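For example, from the SCDF shell (my-composed-task is a placeholder; interval-time-between-checks is the CTR's polling interval, in milliseconds):

task launch my-composed-task --arguments "--interval-time-between-checks=30000"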

I have uploaded a GitHub repository with a minimalistic version of the code, with instructions on how to reproduce in the readme file: https://github.com/JLauzonG/deadlock-bug-stackoverflow

Any clues on how to solve this?

Comments:

Thanks for the detailed write-up! Is it possible to share a simplified sample in a GH repo that reproduces the problem? That would help us retry it on our side more easily. (Sabby Anandan)

I have created a minimalistic version of our task and it's now uploaded here: github.com/JLauzonG/deadlock-bug-stackoverflow. The instructions to reproduce this bug are in the readme file. (Jeremy L-G)

Hi, @jeremy-l-g. Thanks for taking the time to share the sample. This is not the solution, but I wanted to share my findings. 1) There's no option that I'm aware of to supply two datasources to a composed-task-runner (CTR); it is not designed to handle more than one datasource. The Java buildpack's auto-reconfiguration will fail and skip connecting to either of the DBs, falling back to the default H2 database. Because the composed task and SCDF will then be on different DBs, the child-task execution will fail, unable to find the task-id that is persisted in SCDF's TaskRepository. (Sabby Anandan)

2) To somehow run the sample (to reproduce the problem), I changed the code so that both datasources and SCDF connect to the same DB. With that, I was able to run the composed task (e.g., task create lockzz --definition "d1: dlock && d2: dlock") from SCDF on PWS. I didn't see the deadlock error: I re-launched the composed task a few times, and it did its job and shut down the containers at the end as expected. (Sabby Anandan)

3) The ideal approach to applying multiple datasources would be when we standardize the way SCDF accepts deployment properties for the composed-task-runner. We currently have spring-cloud/spring-cloud-dataflow#1717, which addresses the model in which we would propagate the explicit binding to child tasks. (Sabby Anandan)

1 Answer

In every child task's manifest, I declared the second database instance so it would be bound at deployment. However, when SCDF deploys these tasks, the services defined in the manifest are disregarded, so I have to manually bind the second database to each child task after SCDF has initially deployed it on PCF. Binding both database instances through the server's environment variable is not an option either: CTR would inherit both bindings and, effectively, fail.
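For reference, the manual workaround after each initial SCDF deployment looks something like this (cf CLI; child-task-app is a placeholder app name, sem-mysql is the business database service instance from the question):

cf bind-service child-task-app sem-mysql
cf restage child-task-app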