I have developed a Spring Batch application that works fine with a single thread. It's a simple batch application that reads a CSV file using FlatFileItemReader, which outputs a POJO (CSVLineMapper), does some simple processing, and then writes the POJO to a repository.
Now I have made the application multi-threaded using ThreadPoolTaskExecutor. To test the framework's error handling, I throw a RuntimeException for one specific record in the processor. I expected only that thread to terminate, and only the chunk in which the error was thrown to be skipped. Instead, the whole application terminates after the error, having written only 15 records. Why? Am I doing something wrong?
Since restartability is not supported with multithreading, how should a multi-threaded Spring Batch application be designed so that only the problematic record is skipped and the application continues processing the other records without terminating?
Here is the step configuration I am using:
public Step load() {
    // Pool of 5 worker threads; each chunk is processed on one of them.
    ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
    threadPoolTaskExecutor.setCorePoolSize(5);
    threadPoolTaskExecutor.setMaxPoolSize(5);
    threadPoolTaskExecutor.afterPropertiesSet();

    return stepBuilderFactory.get("load")
            // Item types assumed here: the processor takes and returns CSVLineMapper.
            .<CSVLineMapper, CSVLineMapper>chunk(5)
            .reader(reader1())
            .processor(processor())
            .writer(writer())
            .taskExecutor(threadPoolTaskExecutor)
            .listener(stepExecutionListener)
            .listener(processListener())
            .listener(writeListener)
            .build();
}
reader1() returns a FlatFileItemReader with setSaveState(false).
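A note on the reader: FlatFileItemReader is documented as not thread-safe, so in a multi-threaded step the reader usually has to be synchronized. Spring Batch provides SynchronizedItemStreamReader as a wrapper for this purpose. The plain-Java sketch below is not Spring Batch code; it only illustrates the idea of serializing reads behind a wrapper. The names LineSource, SynchronizedLineSource and SynchronizedReaderDemo are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an item reader: returns null when the input is exhausted.
interface LineSource {
    String read();
}

// Wraps a non-thread-safe source so that only one thread reads at a time.
class SynchronizedLineSource implements LineSource {
    private final LineSource delegate;

    SynchronizedLineSource(LineSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized String read() {
        return delegate.read();
    }
}

class SynchronizedReaderDemo {
    // Drains the given lines using several worker threads and returns everything that was read.
    static List<String> readAll(List<String> lines, int threads) {
        Iterator<String> it = lines.iterator(); // not thread-safe on its own
        LineSource source = new SynchronizedLineSource(() -> it.hasNext() ? it.next() : null);
        List<String> seen = Collections.synchronizedList(new ArrayList<>());

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                String line;
                while ((line = source.read()) != null) {
                    seen.add(line);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for readers", e);
        }
        return seen;
    }
}
```

Because every call to read() goes through the synchronized wrapper, each line is handed to exactly one worker, although the order in which lines are seen is not guaranteed.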
One more observation: the log statement in the reader is printed only once in the whole flow, and it is printed by the main thread, whereas the processor and writer are called by different threads of the ThreadPoolTaskExecutor. Why? In my case the reader is not a custom class implementing ItemReader, but the processor and writer are my own implementations of ItemProcessor and ItemWriter respectively.
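On the skipping question: in Spring Batch, skipping is normally enabled on the step builder with faultTolerant(), skip(...) and skipLimit(...); without it, an exception thrown from the processor fails the step. The sketch below is not Spring Batch code. It is a plain-Java illustration, under simplifying assumptions (no transactions, no skip limit, no retry), of the behavior I am after: chunks run on a thread pool, a record whose processing throws is set aside, and the remaining records are still written. SkippingChunkDemo and its Result type are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class SkippingChunkDemo {
    // Holds the records that were written and the records that were skipped.
    static class Result {
        final List<String> written = Collections.synchronizedList(new ArrayList<>());
        final List<String> skipped = Collections.synchronizedList(new ArrayList<>());
    }

    // Splits the items into chunks, processes each chunk on a pool thread, and
    // skips any single item whose processing throws a RuntimeException.
    static Result run(List<String> items, int chunkSize, int threads,
                      Function<String, String> processor) {
        Result result = new Result();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int start = 0; start < items.size(); start += chunkSize) {
            List<String> chunk = items.subList(start, Math.min(start + chunkSize, items.size()));
            pool.submit(() -> {
                List<String> processed = new ArrayList<>();
                for (String item : chunk) {
                    try {
                        processed.add(processor.apply(item));
                    } catch (RuntimeException e) {
                        result.skipped.add(item); // skip only the bad record
                    }
                }
                result.written.addAll(processed); // "write" the rest of the chunk
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for chunks", e);
        }
        return result;
    }
}
```

The point of the design is that the failure is caught per item inside the chunk, so neither the worker thread nor the overall run is terminated by one bad record.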