
I have several Spring Batch (2.1.9.RELEASE) jobs running in production that use org.springframework.batch.core.launch.support.RunIdIncrementer.

Sporadically, I get the following error:

org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException: A job instance already exists and is complete for parameters={run.id=23, tenant.code=XXX}.  If you want to run this job again, change the parameters.
    at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:122) ~[spring-batch-core-2.1.9.RELEASE.jar:na]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.6.0_39]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) ~[na:1.6.0_39]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) ~[na:1.6.0_39]
    at java.lang.reflect.Method.invoke(Method.java:597) ~[na:1.6.0_39]
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:318) ~[spring-aop-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) ~[spring-aop-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) ~[spring-aop-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110) ~[spring-tx-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) ~[spring-aop-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at org.springframework.batch.core.repository.support.AbstractJobRepositoryFactoryBean$1.invoke(AbstractJobRepositoryFactoryBean.java:168) ~[spring-batch-core-2.1.9.RELEASE.jar:na]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) ~[spring-aop-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202) ~[spring-aop-3.1.1.RELEASE.jar:3.1.1.RELEASE]
    at sun.proxy.$Proxy64.createJobExecution(Unknown Source) ~[na:na]
    at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:111) ~[spring-batch-core-2.1.9.RELEASE.jar:na]
    at org.springframework.batch.core.launch.support.CommandLineJobRunner.start(CommandLineJobRunner.java:349) [spring-batch-core-2.1.9.RELEASE.jar:na]
    at org.springframework.batch.core.launch.support.CommandLineJobRunner.main(CommandLineJobRunner.java:574) [spring-batch-core-2.1.9.RELEASE.jar:na]
    at (omitted for brevity)

A sampling from the various XML contexts:

<bean
    id="jobParametersIncrementer"
    class="org.springframework.batch.core.launch.support.RunIdIncrementer" />

<batch:job id="rootJob"
    abstract="true"
    restartable="true">
    <batch:validator>
        <bean class="org.springframework.batch.core.job.DefaultJobParametersValidator">
            <property name="requiredKeys" value="tenant.code"/>
        </bean>
    </batch:validator>
</batch:job>

<batch:job id="rootJobWithIncrementer"
    abstract="true"
    parent="rootJob"
    incrementer="jobParametersIncrementer" />

I use org.springframework.batch.core.launch.support.CommandLineJobRunner to execute the job:

java org.springframework.batch.core.launch.support.CommandLineJobRunner /com/XXX/job123/job123-context.xml job123 tenant.code=XXX -next 

All of the jobs (that use the incrementer) have rootJobWithIncrementer as parent.

I did quite a bit of research and found that some who got this error had success changing the isolation level of the transaction manager. I fiddled with several levels, finally arriving at READ_COMMITTED.

<batch:job-repository
    id="jobRepository"
    data-source="oracle_hmp"
    transaction-manager="dataSourceTransactionManager"
    isolation-level-for-create="READ_COMMITTED"/>

Based on my understanding, this type of error should only happen if the same job is executed at the same time from multiple processes -- so that there might be contention for the incrementer. In this instance, that is not the case, yet we see the error.

Any ideas as to what might be causing this problem? Should I try a different isolation level? Something else?

Thanks!

There is a similar question here, but it is not as well documented (and also lacks an answer).

Did you try the default isolation level as mentioned here? Can you upgrade to 2.2.1.RELEASE (although the changelog doesn't contain any issues about the incrementer)? - Luca Basso Ricci
I tried several of the isolation levels, including the default - SERIALIZABLE. I will attempt the upgrade - our codebase is currently using Spring 3.1.X, but according to the project POM, Spring Batch 2.2.1 requires Spring 3.2.X - I am somewhat concerned about an upgrade, but I will give it a shot. Thanks! - Greg

1 Answer


This might be a long shot, but I figured I'd suggest it: it took me a long time to track down, because the only symptom was sporadically getting the JobInstanceAlreadyCompleteException as you describe.

I was using Oracle, and the BATCH_JOB_SEQ and BATCH_JOB_EXECUTION_SEQ sequences I had created both had a CACHE_SIZE of 10.

This sometimes caused JOB_INSTANCE_ID and JOB_EXECUTION_ID values to be assigned out of order. Spring Batch assumes that the most recent JOB_INSTANCE is the one with max(JOB_INSTANCE_ID) (see org.springframework.batch.core.repository.dao.JdbcJobInstanceDao.FIND_LAST_JOBS_BY_NAME). Since my sequence was sometimes thrown off, this assumption did not always hold.
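To make the failure mode concrete, here is a small simulation rather than actual Spring Batch code. It assumes, purely for illustration, that two database nodes each reserve their own block of sequence values (one way a cached sequence can hand out IDs out of order) and that launches alternate between them. The Node class and the firstCollision method are hypothetical names. The "last instance" lookup mimics the max(JOB_INSTANCE_ID) assumption, and the run id is derived from it the way an incrementing run.id would be.

```java
import java.util.ArrayList;
import java.util.List;

public class SequenceCacheDemo {

    /** Hypothetical database node that reserves sequence values in blocks of cacheSize. */
    static final class Node {
        private final long[] nextUnreserved; // shared counter, held at index 0
        private final int cacheSize;
        private long next;
        private long limit;

        Node(long[] nextUnreserved, int cacheSize) {
            this.nextUnreserved = nextUnreserved;
            this.cacheSize = cacheSize;
        }

        long nextVal() {
            if (next >= limit) {
                // Local block exhausted: reserve the next block from the shared counter.
                next = nextUnreserved[0];
                limit = next + cacheSize;
                nextUnreserved[0] = limit;
            }
            return next++;
        }
    }

    /**
     * Simulates launches that alternate between two nodes.
     * Returns the 1-based launch number of the first duplicate run id, or -1 if none occurs.
     */
    static int firstCollision(int cacheSize, int launches) {
        long[] shared = {1};
        Node[] nodes = {new Node(shared, cacheSize), new Node(shared, cacheSize)};
        List<long[]> instances = new ArrayList<long[]>(); // each entry: {instanceId, runId}

        for (int launch = 1; launch <= launches; launch++) {
            // "Most recent" instance is taken to be the one with the highest id.
            long[] last = null;
            for (long[] inst : instances) {
                if (last == null || inst[0] > last[0]) {
                    last = inst;
                }
            }
            long runId = (last == null) ? 1 : last[1] + 1;

            // A run id that already exists is the analogue of JobInstanceAlreadyCompleteException.
            for (long[] inst : instances) {
                if (inst[1] == runId) {
                    return launch;
                }
            }
            instances.add(new long[] {nodes[(launch - 1) % 2].nextVal(), runId});
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("cache 10: " + firstCollision(10, 20)); // prints "cache 10: 4"
        System.out.println("cache 1:  " + firstCollision(1, 20));  // prints "cache 1:  -1"
    }
}
```

With a block size of 10, the fourth launch still sees instance 11 (run id 2) as the latest even though instance 2 (run id 3) was created after it, so it computes run id 3 again. With a block size of 1 the IDs stay ascending and no duplicate appears.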

I fixed it by setting the sequences to NOCACHE.

An easy way to tell if this might be your problem is to check if your sequences are set to CACHE at all and/or to make sure that your JOB_INSTANCE_ID and JOB_EXECUTION_ID are always ascending with each new run.
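The ascending check can be sketched as a small helper. This is only an illustration: the class and method names are made up, and how you obtain the IDs in creation order is up to you (for example, by reading JOB_EXECUTION_ID from BATCH_JOB_EXECUTION ordered by CREATE_TIME). It reports every ID that is smaller than an ID created before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AscendingIdCheck {

    /** Returns the IDs that are smaller than some ID that was created before them. */
    static List<Long> outOfOrder(List<Long> idsInCreationOrder) {
        List<Long> offenders = new ArrayList<Long>();
        long maxSoFar = Long.MIN_VALUE;
        for (long id : idsInCreationOrder) {
            if (id < maxSoFar) {
                offenders.add(id); // created later, yet numbered lower
            } else {
                maxSoFar = id;
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Sample IDs as they might look if two cached blocks were interleaved.
        System.out.println(outOfOrder(Arrays.asList(1L, 11L, 2L, 12L, 3L))); // prints [2, 3]
    }
}
```

An empty result means the IDs were always ascending, in which case the sequence cache is probably not your culprit.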