TL;DR: How should one create and launch new Spring Batch jobs from within a running Spring Batch job? Transaction boundaries seem to be the problem. This seems to be a classic question, but here it goes again:
I have the following use case: I need to poll an FTP server and store the XML files found there as blobs in a database. Each XML file has 0...N entries of interest that I need to send to an external web service, storing the response each time. Responses can be retryable or non-retryable, and I need to store each request and its response for auditing purposes.
The domain/JPA model is as follows: a Batch (containing the XML blob) contains 0...N BatchRow objects. A BatchRow contains the data to be sent to the web service, and it also contains 1...N BatchRowHistory objects holding status information about the web service calls.
I was asked to implement this using Spring Batch (Spring Integration could have been another possibility, since this is an integration case). I've struggled with different approaches, and I find this task much more complex, and therefore more difficult, than it IMHO should be.
I've split the tasks into the following jobs:
Job1:
- Step11: Fetch the file and store it in the database as a blob.
- Step12: Split the XML into entries and store those entries in the db.
- Step13: Create and launch a Job2 instance for each entry stored in Step12, and set a "Job2 created" flag on those entries in the domain database.
Job2:
- Step21: Call the web service for each entry and store the result in the db. The retry and skip logic lives here. Job2 instances may need manual restarting etc.
The logic behind this structure is that Job1 runs on a periodic schedule (once a minute or so). Job2 instances run whenever they exist, and they either succeed or fail once their retry limit is exhausted. The domain model basically stores only results, and Spring Batch is responsible for running the show. Manual relaunches etc. can be handled via Spring Batch Admin (at least I hope so). Each Job2 also has the BatchRow's id in its JobParameters map, so it can be identified in Spring Batch Admin.
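To make the Job2 side concrete, here is a minimal sketch of its configuration; the reader/processor/writer bean names and the exception classes are placeholders, not my actual code:

```xml
<batch:job id="job2">
    <batch:step id="Step21">
        <batch:tasklet>
            <batch:chunk reader="batchRowReader" processor="webServiceProcessor"
                         writer="historyWriter" commit-interval="1"
                         retry-limit="3" skip-limit="5">
                <!-- transient web service failures should be retried -->
                <batch:retryable-exception-classes>
                    <batch:include class="org.springframework.remoting.RemoteAccessException"/>
                </batch:retryable-exception-classes>
                <!-- non-retryable responses should be skipped and recorded -->
                <batch:skippable-exception-classes>
                    <batch:include class="foo.bar.NonRetryableWebServiceException"/>
                </batch:skippable-exception-classes>
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
```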
Question 1: Does this job structure make sense? Creating a new Spring Batch job for each row in the db kind of seems to defeat the purpose and reinvent the wheel at some level.
Question 2: How do I create those Job2 instances in Step13?
I first ran into problems with transactions and the JobRepository, but succeeded in launching a few jobs with the following setup:
<batch:step id="Step13" parent="stepParent">
    <batch:tasklet>
        <batch:transaction-attributes propagation="NEVER"/>
        <batch:chunk reader="rowsWithoutJobReader" processor="batchJobCreator"
                     writer="itemWriter" commit-interval="10"/>
    </batch:tasklet>
</batch:step>

<bean id="stepParent" class="org.springframework.batch.core.step.item.FaultTolerantStepFactoryBean" abstract="true"/>
Please note that commit-interval="10" means this can currently create up to 10 jobs and that's it... because batchJobCreator calls the JobLauncher.run method, which goes swimmingly, BUT itemWriter cannot write the BatchRows back to the database with the updated information (the boolean jobCreated flag toggled on). The obvious reason for that is the propagation="NEVER" in the transaction attributes, but without it I can't create jobs with the jobLauncher at all.
Because the updates are not persisted to the database, I get the same BatchRows again and again, and they clutter the log with:
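For completeness, batchJobCreator launches Job2 through a plain SimpleJobLauncher, configured roughly like this (a sketch, not my exact config; whether switching to an asynchronous TaskExecutor would sidestep the transaction problem is exactly the kind of thing I'm unsure about):

```xml
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
    <!-- the default is a synchronous executor, so Job2 runs inside Step13;
         an async executor would return control to Step13 immediately instead:
    <property name="taskExecutor">
        <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
    </property>
    -->
</bean>
```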
org.springframework.batch.retry.RetryException: Non-skippable exception in recoverer while processing; nested exception is org.springframework.batch.core.repository.JobExecutionAlreadyRunningException: A job execution for this job is already running: JobInstance: id=1, version=0, JobParameters=[{batchRowId=71}], Job=[foo.bar]
at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor$2.recover(FaultTolerantChunkProcessor.java:278)
at org.springframework.batch.retry.support.RetryTemplate.handleRetryExhausted(RetryTemplate.java:420)
at org.springframework.batch.retry.support.RetryTemplate.doExecute(RetryTemplate.java:289)
at org.springframework.batch.retry.support.RetryTemplate.execute(RetryTemplate.java:187)
at org.springframework.batch.core.step.item.BatchRetryTemplate.execute(BatchRetryTemplate.java:215)
at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor.transform(FaultTolerantChunkProcessor.java:287)
at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:190)
at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:74)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:386)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:250)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:293)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:120)
at java.lang.Thread.run(Thread.java:680)
This means the job has already been created in Spring Batch, and it tries to create those jobs again on later executions of Step13. I could circumvent this by setting the jobCreated flag to true in Job2/Step21, but that feels kludgy and wrong to me.
Question 3: I previously had a more domain-object-driven approach: Spring Batch jobs scanned the domain tables using fairly elaborate JPQL queries and JPA item readers. The problem with this approach is that it does not use Spring Batch's finer features; the history and retry logic are the issue. I have to code the retry logic directly into the JPQL queries (for example, if a BatchRow has more than 3 BatchRowHistory elements, it has failed and needs to be manually re-examined). Should I bite the bullet and continue with this approach instead of trying to create an individual Spring Batch job for each web service call?
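To give an idea of what those readers look like, here is a simplified sketch of one of them (entity, collection, and bean names are simplified for this post):

```xml
<bean id="failedRowReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
    <!-- the retry logic is encoded in the query itself:
         more than 3 history rows means the row has failed for good -->
    <property name="queryString"
              value="SELECT r FROM BatchRow r WHERE SIZE(r.batchRowHistory) > 3"/>
    <property name="pageSize" value="10"/>
</bean>
```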
Software info if needed: Spring Batch 2.1.9, Hibernate 4.1.2, Spring 3.1.2, Java 6.
Thank you in advance and sorry for the long story, Timo
Edit 1: The reason I think I need to spawn new jobs is this:
Loop until the reader returns null OR an exception is thrown:
    Transaction start
    reader - processor - writer loop for the whole N rows
    Transaction end for batch of size N
Each failed entry is the problem: I want a manually restartable execution (jobs are the only restartable units in Spring Batch Admin, right?) for each row in the batch, so that I can use Spring Batch Admin to view the failed jobs (with their job parameters, which contain row ids from the domain db) and restart them, etc. How do I accomplish this kind of behaviour without spawning jobs and without storing the history in the domain db?