I'm trying out Spring Batch. I have seen many examples of running jobs via ItemReader and ItemWriter. If a job runs without errors there is no problem, but I haven't found out how to handle state when a job fails after processing a number of records.

My scenario is really simple: read records from an XML file (ItemReader) and call an external system to store them (ItemWriter). So what happens if the external system is unavailable in the middle of the process and after a while the job status is set to FAILED? If I restart the job manually the next day, when the external system is up and running again, I will get duplicates for the previously loaded records.

In some way I must have information for skipping the already loaded records. I have tried to store a cursor via the ExecutionContext, but when I restart the job I get a new JOB_EXECUTION_ID and the cursor data is lost, because I get a new row in BATCH_STEP_EXECUTION_CONTEXT.SHORT_CONTEXT. BATCH_STEP_EXECUTION.COMMIT_COUNT and BATCH_STEP_EXECUTION.READ_COUNT are also reset on restart.

I restart the job by using the JobOperator: jobOperator.restart(jobExecutionId);

Is there a way to restart a job without getting a new jobExecutionId, or an alternative way to get the state of failed jobs? If someone has found (or can provide) an example covering state and error handling, I would be happy.

One alternative solution is of course to create my own table that keeps track of processed records, but I really hope the framework has a mechanism for this. Otherwise I don't understand the point of Spring Batch.

Regards, Mats

1 Answer

One of the primary features Spring Batch provides is the persistence of a job's state in the job repository. When a job fails, the default behavior on restart is for the job to resume at the step that failed (skipping the steps that have already completed successfully). Within a chunk-based step, most of the framework's readers (the StaxEventItemReader included) store what records have been processed in the job repository (specifically within the ExecutionContext). By default, when a chunk-based step fails, it is restarted at the chunk that failed last time, skipping the successfully processed chunks.
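
As a minimal sketch of that mechanism, a custom reader can participate in the same state saving by implementing ItemStream; the key name and the in-memory record list below are hypothetical placeholders, not framework requirements:

import java.util.List;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemStreamException;

public class CursorTrackingReader implements ItemReader<String>, ItemStream {

    private static final String CURSOR_KEY = "current.record.index"; // hypothetical key name

    private final List<String> records;
    private int cursor = 0;

    public CursorTrackingReader(List<String> records) {
        this.records = records;
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        // On a restart, Spring Batch passes in the last committed context,
        // so the reader resumes where the failed execution left off.
        if (executionContext.containsKey(CURSOR_KEY)) {
            cursor = executionContext.getInt(CURSOR_KEY);
        }
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // Called before each chunk commit; the framework persists this context
        // to BATCH_STEP_EXECUTION_CONTEXT.
        executionContext.putInt(CURSOR_KEY, cursor);
    }

    @Override
    public void close() throws ItemStreamException {
    }

    @Override
    public String read() {
        // Returning null signals the end of the input.
        return cursor < records.size() ? records.get(cursor++) : null;
    }
}

Spring Batch copies the failed step's ExecutionContext into the new execution on restart, which is why a cursor stored this way survives the new JOB_EXECUTION_ID.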

An example of all of this would be a three-step job:

<job id="job1">
    <step id="step1" next="step2">
        <tasklet>
            <chunk reader="reader1" writer="writer1" commit-interval="10"/>
        </tasklet>
    </step>
    <step id="step2" next="step3">
        <tasklet>
            <chunk reader="reader2" writer="writer2" commit-interval="10"/>
        </tasklet>
    </step>
    <step id="step3">
        <tasklet>
            <chunk reader="reader3" writer="writer3" commit-interval="10"/>
        </tasklet>
    </step>
</job>

And let's say this job completes step1, then step2 has 1000 records to process but fails at record 507. The chunk that consists of records 501-510 would roll back and the job would be marked as failed. The restart of that job would skip step1, skip records 1-500 in step2, and start back at record 501 of step2 (assuming you're using stateful item readers).
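
The "stateful" part comes from the reader's saveState flag, which is on by default. Here is a rough Java equivalent of one of the readers above showing where that flag lives; the file name, fragment element name, and unmarshaller are assumptions for illustration:

import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.core.io.FileSystemResource;
import org.springframework.oxm.Unmarshaller;

public class ReaderConfig {

    // Equivalent in spirit to "reader2" above.
    public StaxEventItemReader<Object> reader2(Unmarshaller recordUnmarshaller) {
        StaxEventItemReader<Object> reader = new StaxEventItemReader<>();
        reader.setResource(new FileSystemResource("input/records.xml"));
        reader.setFragmentRootElementName("record"); // each <record/> fragment is one item
        reader.setUnmarshaller(recordUnmarshaller);
        // saveState defaults to true: the read position is written to the step's
        // ExecutionContext at each commit, which is what lets a restart resume
        // at the failed chunk instead of the beginning of the file.
        reader.setSaveState(true);
        return reader;
    }
}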

With regards to the jobExecutionId on a restart, Spring Batch has the concept of a job instance (a logical run) and a job execution (a physical run). For a job that runs daily, the logical run would be the Monday run, the Tuesday run, and so on; each of these would be its own JobInstance. If the job is successful, the JobInstance ends up with only one JobExecution associated with it. If it fails and is re-run, a new JobExecution is created for each restart, all attached to the same JobInstance.
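
One way to see that relationship in code is through the JobExplorer; a minimal sketch, assuming a configured JobExplorer and the job1 name from the example above:

import java.util.List;

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;

public class ExecutionHistory {

    // Prints every physical run (JobExecution) of the most recent
    // logical run (JobInstance) of "job1".
    public void printLastInstanceHistory(JobExplorer jobExplorer) {
        List<JobInstance> instances = jobExplorer.getJobInstances("job1", 0, 1);
        if (instances.isEmpty()) {
            return;
        }
        JobInstance lastInstance = instances.get(0);
        // One entry per attempt: the original failed execution plus each restart.
        for (JobExecution execution : jobExplorer.getJobExecutions(lastInstance)) {
            System.out.println(execution.getId() + " -> " + execution.getStatus());
        }
    }
}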

You can read more about error handling in general and specific scenarios in the Spring Batch documentation found here: http://docs.spring.io/spring-batch/trunk/reference/html/index.html