Consider a Step bean:
    @Bean
    public Step stepForChunkProcessing() {
        return stepBuilderFactory
                .get("stepForChunkProcessing")
                .<Entity1, Entity2>chunk(1000)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .taskExecutor(taskExecutor())
                .throttleLimit(10)
                .build();
    }
    //@formatter:on

    @Bean
    public TaskExecutor taskExecutor() {
        return new SimpleAsyncTaskExecutor("MyApplication");
    }
Requirement: The reader reads records (of Entity1) from a file, the processor processes them, and the writer writes them to the database.
Before adding the TaskExecutor, only one thread was created. It would loop through the reader and processor 1000 times, as defined by the chunk size above, then move to the writer and write all 1000 records. It would then start again from record 1001 and pass the next 1000 records through the reader and processor. This is synchronous execution.
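To make the single-threaded behaviour described above concrete, here is a minimal plain-Java sketch of that loop. It uses no Spring Batch classes. The class name `ChunkLoopSketch`, the `run` method, and the in-memory list standing in for the file are all hypothetical. It mirrors the description in this question, not necessarily the exact order of operations inside the framework.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {

    /** Runs a single-threaded chunk loop and returns the size of each written chunk. */
    static List<Integer> run(int totalRecords, int chunkSize) {
        // Hypothetical stand-in for the file: records numbered 1..totalRecords.
        List<Integer> file = new ArrayList<>();
        for (int i = 1; i <= totalRecords; i++) {
            file.add(i);
        }
        Iterator<Integer> reader = file.iterator();

        List<Integer> writtenChunkSizes = new ArrayList<>();
        while (reader.hasNext()) {
            List<String> chunk = new ArrayList<>();
            // Read and process one item at a time until the chunk is full or input ends.
            while (chunk.size() < chunkSize && reader.hasNext()) {
                Integer item = reader.next();        // stands in for the reader
                chunk.add("processed-" + item);      // stands in for the processor
            }
            // The whole chunk is handed over in one call (stands in for the writer).
            writtenChunkSizes.add(chunk.size());
        }
        return writtenChunkSizes;
    }

    public static void main(String[] args) {
        // 2500 records with a chunk size of 1000 give chunks of 1000, 1000 and 500.
        System.out.println(run(2500, 1000)); // prints [1000, 1000, 500]
    }
}
```

With one thread, the iterator's position is the only record of how far the file has been read, so the second chunk naturally starts at record 1001.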
After adding the TaskExecutor with a throttle limit of 10, 10 threads were created, independent of each other. How do they keep track of which records from the file have already been processed by other threads? Also, if I add the synchronized keyword to the reader's read method, how do the different threads still keep track of the records that have already been processed from the file?
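The scenario in the question can be pictured with the following plain-Java sketch. It assumes all 10 threads call `read()` on one shared reader instance whose method is `synchronized`. `SharedReaderSketch`, `SharedReader`, and `run` are hypothetical names, the in-memory list stands in for the file, and nothing here is Spring Batch code. It only illustrates a shared-cursor model, under the assumption that the reader is a single object rather than one per thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedReaderSketch {

    /** Hypothetical reader: one instance and one cursor, shared by every worker thread. */
    static class SharedReader {
        private final Iterator<Integer> cursor;

        SharedReader(List<Integer> records) {
            this.cursor = records.iterator();
        }

        // synchronized: only one thread at a time may advance the shared cursor.
        synchronized Integer read() {
            return cursor.hasNext() ? cursor.next() : null; // null signals end of input
        }
    }

    /** Returns {total successful reads, number of distinct records seen}. */
    static int[] run(int totalRecords, int threads) {
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= totalRecords; i++) {
            records.add(i);
        }
        SharedReader reader = new SharedReader(records);
        Set<Integer> distinct = ConcurrentHashMap.newKeySet();
        AtomicInteger reads = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                Integer item;
                // Each worker pulls from the same reader until the input is exhausted.
                while ((item = reader.read()) != null) {
                    reads.incrementAndGet();
                    distinct.add(item);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { reads.get(), distinct.size() };
    }

    public static void main(String[] args) {
        int[] result = run(10_000, 10);
        System.out.println("reads=" + result[0] + ", distinct=" + result[1]);
    }
}
```

In this model the threads do not track anything themselves: the position in the input lives in the single shared reader, and `synchronized` ensures each call hands out the next unread record exactly once, so both counts equal the number of records. The order in which records reach each thread is not deterministic.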