1
votes

In Spring Batch, how do I loop the reader, processor and writer N times?

My requirement is:

I have N customers/clients. For each customer/client, I need to fetch the records from the database (Reader), process all records for that customer/client (Processor), and then write the records to a file (Writer).

How do I loop the Spring Batch job N times?

2
What have you tried so far? Can you post your code/config files? - araknoid
I did it for one customer, so I want to find a way to loop it for N customers:

    <batch:job id="reportBatchJob">
        <batch:step id="Step1">
            <batch:tasklet>
                <batch:chunk reader="reportReader" processor="reportProcessor"
                             writer="reportWriter" commit-interval="100">
                </batch:chunk>
            </batch:tasklet>
        </batch:step>
    </batch:job>

- rockstar8080

2 Answers

0
votes

AFAIK there's no framework support for this scenario, at least not the way you want to solve it. I'd suggest solving the problem differently:

Option 1

Read/process/write all records from all customers at once. You can only do this if they are all in the same database; I would not recommend it otherwise, because you'll have to configure JTA/XA transactions and it's not worth the trouble.

Option 2

Run your job once for each client (the best option, in my opinion). Save the necessary info for each client in a separate properties file (DB connection data, values to filter records by client, whatever other client-specific data you may need) and pass a parameter to the job identifying the client it has to use. This way you can control which client is processed, and when, using shell scripts and/or cron. If you use Spring Boot + Spring Batch you can store each client's configuration in a profile (application-clientX.properties) and run the process like:

$>  java -Dspring.profiles.active="clientX"  \
     -jar "yourBatch-1.0.0-SNAPSHOT.jar"     \
     -next
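Such a profile-specific properties file could hold whatever the job needs for that client. A minimal sketch, where the property names and values are purely illustrative assumptions:

```properties
# application-clientX.properties (hypothetical example)
spring.datasource.url=jdbc:mysql://clientx-db:3306/reports
spring.datasource.username=batch_user
report.client.filter=CLIENT_X
report.output.file=/data/reports/clientX-report.csv
```

Activating the `clientX` profile then makes Spring Boot layer these values over `application.properties`, so the same job binary serves every client.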

Bonus - Option 3

If none of the above fits your needs, or you insist on solving the problem the way you presented it, then you can configure the job dynamically, depending on parameters, creating one step for each client with Java config:

@Bean
public Job job() {
    // Start with the first client's step, then chain the remaining ones
    SimpleJobBuilder builder = jobBuilders.get("job")
            .start(buildStepByClient(clientsToProcess.get(0)));
    for (Client c : clientsToProcess.subList(1, clientsToProcess.size())) {
        builder.next(buildStepByClient(c));
    }
    return builder.build();
}
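A hypothetical `buildStepByClient` helper for the loop above might look like the following sketch. The `Record` type, the `stepBuilders` factory field and the per-client reader/writer methods are assumptions, not part of the original answer:

```java
// Sketch only: builds one chunk-oriented step per client
private Step buildStepByClient(Client c) {
    return stepBuilders.get("step-" + c.getId())
            .<Record, Record>chunk(100)
            .reader(reportReader(c))      // reader filtered to this client's records
            .processor(reportProcessor())
            .writer(reportWriter(c))      // writes this client's output file
            .build();
}
```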

Again, I strongly advise you not to go this way: it's ugly, against the framework's philosophy, hard to maintain and debug, and you'll probably have to use JTA/XA here as well...

I hope this helps!

0
votes

Local Partitioning will solve your problem.

In your partitioner, you put each of your client IDs into the partition map, as shown below (just pseudo code):

public class PartitionByClient implements Partitioner {

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> result = new HashMap<>();
        int partitionNumber = 1;
        for (String client : allClients) {
            ExecutionContext value = new ExecutionContext();
            value.putString("client", client);
            result.put("Client [" + client + "] : THREAD " + partitionNumber, value);
            partitionNumber++;
        }
        return result;
    }
}

This is just pseudo code; have a look at the detailed documentation on partitioning.

You will have to mark your reader, processor and writer with @StepScope (i.e. whichever components need the value of client). The reader will use this client in the WHERE clause of its SQL. Use @Value("#{stepExecutionContext['client']}") String client in the reader definition (and wherever else you need it) to inject the value.
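A minimal sketch of such a step-scoped reader, assuming a JDBC cursor reader and a hypothetical `records` table with a `client_id` column (the table, column and `Record` type are illustrative assumptions):

```java
// Sketch: one reader instance per partition, bound to that partition's client
@Bean
@StepScope
public JdbcCursorItemReader<Record> reportReader(
        @Value("#{stepExecutionContext['client']}") String client) {
    JdbcCursorItemReader<Record> reader = new JdbcCursorItemReader<>();
    reader.setDataSource(dataSource);
    reader.setSql("SELECT * FROM records WHERE client_id = ?");
    reader.setPreparedStatementSetter(ps -> ps.setString(1, client));
    reader.setRowMapper(new BeanPropertyRowMapper<>(Record.class));
    return reader;
}
```

Because the bean is step-scoped, Spring creates a fresh reader for each partition, each seeing only its own client's rows.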

Now the final piece: you will need a task executor. As many clients as concurrencyLimit will run in parallel, provided you set this task executor in your master partitioner step configuration.

@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor simpleTaskExecutor = new SimpleAsyncTaskExecutor();
    simpleTaskExecutor.setConcurrencyLimit(concurrencyLimit);
    return simpleTaskExecutor;
}

Set concurrencyLimit to 1 if you want only one client running at a time.
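Wiring it all together, the master step could be configured roughly like this sketch (bean names such as `slaveStep` and the `stepBuilders` factory are assumptions):

```java
// Sketch: master step that fans out one partition per client
@Bean
public Step masterStep() {
    return stepBuilders.get("masterStep")
            .partitioner("slaveStep", new PartitionByClient())
            .step(slaveStep())            // the chunk step with reader/processor/writer
            .taskExecutor(taskExecutor()) // runs partitions in parallel
            .build();
}
```

Each entry the partitioner puts in the map becomes one execution of `slaveStep`, with its own step execution context carrying the `client` value.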