7
votes

I have a Spring Batch job that is expected to process N job-ids sequentially, in FIFO order. The job has 5 steps.
We use a decider to determine whether any more job-ids are present. If so, we go to the first step and run all the steps for that job-id.
I see a "duplicate step" message in the log emitted by Spring Batch, which appears harmless unless a step for the first job (say job-id=1) ends up in the UNKNOWN state. In that case, the same step for the second job (job-id=2) fails to start, stating "Step is in UNKNOWN state, it is dangerous to restart....". Is there a better way to define a Spring Batch job that processes N job-ids?

There is a table that holds the job information. Each job places orders into the Order table. It is possible that two jobs need to be processed on the same day. A job can insert/update the same order number with the same revision (with differences in other details) or a different revision of the same order number. The batch program must process these jobs in FIFO order based on success_time in the job table.

Assume the table structure below:

Job_Id      job_name    success_time
1           job1        2014-09-29 10:00:00
2           job2        2014-09-29 13:00:00

Order_id    order_number    order_revision  order_details   job_id
1           ABC             1               Test1            1
2           XYZ             1               Test2            1
3           ABC             2               Test1-Rev2       2
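
The FIFO requirement on success_time can be sketched in plain Java. This is only an illustration under assumptions: the `JobRow` record and the `processingOrder` helper are hypothetical names, the rows are held in memory rather than read from the job table, and breaking ties by job id is an added assumption.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FifoJobOrdering {

    // Hypothetical in-memory model of one row of the job table.
    record JobRow(long jobId, String jobName, LocalDateTime successTime) {}

    // Returns the job ids sorted by success_time, oldest first (FIFO).
    // Ties are broken by job id so the order is deterministic (an assumption).
    static List<Long> processingOrder(List<JobRow> jobs) {
        List<JobRow> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparing(JobRow::successTime)
                .thenComparingLong(JobRow::jobId));
        List<Long> ids = new ArrayList<>();
        for (JobRow job : sorted) {
            ids.add(job.jobId());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample rows from the job table above, deliberately out of order.
        List<JobRow> jobs = List.of(
                new JobRow(2, "job2", LocalDateTime.of(2014, 9, 29, 13, 0)),
                new JobRow(1, "job1", LocalDateTime.of(2014, 9, 29, 10, 0)));
        System.out.println(processingOrder(jobs)); // prints [1, 2]
    }
}
```

In the real job, the same ordering would more likely come from an ORDER BY success_time clause in the query that selects the pending jobs.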

Sample configuration is shown below. For brevity, I have removed metadata definitions and reused the reader and writer.

<batch:step id="abstractParentStep" abstract="true">
    <batch:tasklet>
        <batch:chunk commit-interval="100" />
    </batch:tasklet>
</batch:step>

<!-- Using the same reader and writer to simplify the scenario -->
<batch:job id="OrderProcessingJob">
    <batch:step id="Collect-Statistics-From-Staging-Tables" next="Validate-Order-Mandatory-Fields" parent="abstractParentStep">
        <batch:tasklet>
            <batch:chunk reader="orderReader" writer="orderWriter" />
        </batch:tasklet>
    </batch:step>
    <batch:step id="Validate-Order-Mandatory-Fields" next="Validate-Item-Mandatory-Fields" parent="abstractParentStep">
        <batch:tasklet>
            <batch:chunk reader="orderReader" writer="orderWriter" />
        </batch:tasklet>
    </batch:step>
    <batch:step id="Validate-Item-Mandatory-Fields" next="decision" parent="abstractParentStep">
        <batch:tasklet>
            <batch:chunk reader="orderReader" writer="orderWriter" />
        </batch:tasklet>
    </batch:step>
    <batch:decision id="decision" decider="processMoreJobsDecider">
        <batch:next on="REPEAT" to="Validate-Order-Mandatory-Fields" />
        <batch:end on="COMPLETED" />
    </batch:decision>

</batch:job>

In the first step, we check how many jobs (count) need to be processed and place that count into the ExecutionContext. In the decider, we check whether the total number of jobs processed matches the count, and return the REPEAT status if there are more job_ids to process.
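
The counting logic of the decider can be sketched without any Spring Batch dependency. This is a plain-Java model, not the actual `processMoreJobsDecider`: the class and method names are hypothetical, and the two counters stand in for values that would be read from the ExecutionContext. In Spring Batch itself, the same comparison would sit inside a `JobExecutionDecider.decide(JobExecution, StepExecution)` implementation that returns a `FlowExecutionStatus`.

```java
class ProcessMoreJobsDecision {

    // Exit codes matching the <batch:decision> transitions in the configuration above.
    static final String REPEAT = "REPEAT";
    static final String COMPLETED = "COMPLETED";

    // totalJobs: the count stored by the first step.
    // processedJobs: how many job_ids have been run through the steps so far.
    static String decide(int totalJobs, int processedJobs) {
        return processedJobs < totalJobs ? REPEAT : COMPLETED;
    }

    public static void main(String[] args) {
        int totalJobs = 2;   // e.g. job_id 1 and 2 from the sample table
        int processedJobs = 0;
        String status;
        do {
            processedJobs++; // stands in for one pass through the repeated steps
            status = decide(totalJobs, processedJobs);
            System.out.println("processed=" + processedJobs + " -> " + status);
        } while (REPEAT.equals(status));
    }
}
```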

We ran into the exception mentioned above when a step of the first job remained in the UNKNOWN state: the decider decided there was one more job_id to process, and the same step for the second job failed with that message.

I'm getting the same message in the log, using a similar setup, with a decider jumping to a previous step: "Duplicate step [setCurrentPartyType] detected in execution of job=[i82.job]. If either step fails, both will be executed again on restart. - org.springframework.batch.core.job.SimpleStepHandler @ 112" - danidemi

2 Answers

3
votes

You should give each step a unique name. If you use partitioning, this is done for you automatically.

See this gist, file partitionedSimple.groovy (you can run all the examples just by downloading the files and running groovy <filename.groovy>). In step1, we determine the number of steps we'll need subsequently (hardcoded to 3 there) and save it in the job context (first in the step context, which we then promote). Then we create a partitioned step, partitionedStep, which will launch 3 steps. Their names will be repeatedStep:<partition name>. In each partition, we also put a key named partitionIndex in the context, so we can retrieve it in the tasklet where we implement the repeated step.

Then we run an example where we force it to fail when it's processing item 2. We get these step executions:

Status is: FAILED
Step executions: 
  1: step1 
  2: partitionedStep FAILED
  4: repeatedStep:partition_1 
  5: repeatedStep:partition_2 FAILED
  3: repeatedStep:partition_3 

If we then restart this job and remove the error triggering, only the second item will be processed:

Status is: COMPLETED
Step executions: 
  6: partitionedStep 
  null: repeatedStep:partition_1 STARTING
  7: repeatedStep:partition_2 
  null: repeatedStep:partition_3 STARTING

I also added a slightly more complicated example where the repeated step is actually a flow step and where the step names are dynamically generated by hand -- this is important if you want to repeat a flow, as you'll have to give unique names to the steps inside each execution of the flow.
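
Generating step names by hand can be as simple as appending a per-iteration suffix. The sketch below is not taken from the gist: the `UniqueStepNames` class and the "base name plus job-id" scheme are assumptions for illustration, and the base names are borrowed from the step ids in the question.

```java
import java.util.ArrayList;
import java.util.List;

class UniqueStepNames {

    // Hypothetical naming scheme (an assumption): append the job-id to the
    // base step name, e.g. "Validate-Order-Mandatory-Fields-2".
    static String stepName(String baseName, long jobId) {
        return baseName + "-" + jobId;
    }

    // Builds the step names for one pass of the repeated flow.
    static List<String> stepNamesForJob(List<String> baseNames, long jobId) {
        List<String> names = new ArrayList<>();
        for (String baseName : baseNames) {
            names.add(stepName(baseName, jobId));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> repeatedSteps = List.of(
                "Validate-Order-Mandatory-Fields",
                "Validate-Item-Mandatory-Fields");
        // Each job-id gets its own set of step names, so no name is reused.
        for (long jobId = 1; jobId <= 2; jobId++) {
            System.out.println(stepNamesForJob(repeatedSteps, jobId));
        }
    }
}
```

Because every pass gets distinct names, a step left in a bad state for one job-id no longer shares a name with the step for the next job-id.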

This can also be done without partitioning, with a looping decider. The idea here is that you have a wrapping step that repeats (allowStartIfComplete) and wraps a flow with your desired steps. These steps are created on demand thanks to the step-scoped bean factories. The reason for the seemingly redundant wrapping step is that the flow builder inside the job() bean factory needs to know step names ahead of time to build the transition states, so we "hide" the step names, which are unknown at that point, inside another step. Maybe there's a way to simplify it. The executions for the first run are:

Step executions: 
  1: step1 
  2: wrappingStep 
  3: repeated-1 
  4: wrappingStep FAILED
  5: repeated-2 FAILED

(notice repeated-3 is never executed)

and on the second run:

Step executions: 
  6: wrappingStep 
  7: wrappingStep 
  8: repeated-2 
  9: wrappingStep 
  10: repeated-3
1
votes

Your problem is that you start your flow with a 'next' instead of a start.

I use Java config rather than XML, but got a similar exception (not particularly helpful error output) with:

@Bean
public Flow insertGbDatabaseRecordsFlow(final Step populateFpSettlementsStep, final Step populateGbDatabaseStep) {
    FlowBuilder<Flow> flowBuilder = new FlowBuilder<>("insertGbDatabaseRecordsFlow");
    flowBuilder.next(populateFpSettlementsStep);
    flowBuilder.next(populateGbDatabaseStep);
    return flowBuilder.build();
}

The fix was to change the first next to start:

@Bean
public Flow insertGbDatabaseRecordsFlow(final Step populateFpSettlementsStep, final Step populateGbDatabaseStep) {
    FlowBuilder<Flow> flowBuilder = new FlowBuilder<>("insertGbDatabaseRecordsFlow");
    flowBuilder.start(populateFpSettlementsStep);
    flowBuilder.next(populateGbDatabaseStep);
    return flowBuilder.build();
}

Presumably the same applies to Spring Batch XML config.