0 votes

I am not able to decide whether the Spring Batch framework is applicable for the requirement below. I need experts' input on this.

Following is my requirement:

Read multiple Oracle tables (at least 10 tables, including both transaction and master tables), perform complex calculations based on the business rules, and insert/update/delete records in the transaction tables.

I have identified the following two designs:

Design # 1:

ItemReader: Select the eligible records from the key transaction table.

ItemProcessor: Fetch additional details from the DB using the key available in the record retrieved by the ItemReader (this would require multiple DB transactions). Do the validation and computation and add the details to be written to the DB as objects in a list.

ItemWriter: Write the details available in the objects using a custom ItemWriter (insert/update/delete operations).

With this design, we can achieve parallel processing, but we increase the number of DB transactions.
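Roughly, I picture Design # 1 as a single chunk-oriented step, something like the sketch below. This is only a sketch in Spring Batch 4.x style Java configuration; the table names, SQL, and the use of Map<String, Object> as the item type are placeholders, and the DataSource, JdbcTemplate and custom writer beans are assumed to be defined elsewhere.

    import java.util.Map;
    import javax.sql.DataSource;

    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.batch.item.ItemWriter;
    import org.springframework.batch.item.database.JdbcCursorItemReader;
    import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jdbc.core.JdbcTemplate;

    @Configuration
    @EnableBatchProcessing
    public class Design1Config {

        @Bean
        public JdbcCursorItemReader<Long> keyReader(DataSource dataSource) {
            // ItemReader: stream only the eligible keys from the driving transaction table
            return new JdbcCursorItemReaderBuilder<Long>()
                    .name("keyReader")
                    .dataSource(dataSource)
                    .sql("SELECT order_id FROM orders WHERE status = 'ELIGIBLE'")
                    .rowMapper((rs, rowNum) -> rs.getLong("order_id"))
                    .build();
        }

        @Bean
        public ItemProcessor<Long, Map<String, Object>> enrichingProcessor(JdbcTemplate jdbcTemplate) {
            // ItemProcessor: fetch the related rows for this key, then validate and compute
            return key -> {
                Map<String, Object> details = jdbcTemplate.queryForMap(
                        "SELECT * FROM order_master WHERE order_id = ?", key);
                // ... apply the business rules here and build the object(s) to be written ...
                return details;
            };
        }

        @Bean
        public Step design1Step(StepBuilderFactory steps,
                                JdbcCursorItemReader<Long> keyReader,
                                ItemProcessor<Long, Map<String, Object>> enrichingProcessor,
                                ItemWriter<Map<String, Object>> customItemWriter) {
            // One DB commit per chunk of 100 processed items
            return steps.get("design1Step")
                    .<Long, Map<String, Object>>chunk(100)
                    .reader(keyReader)
                    .processor(enrichingProcessor)
                    .writer(customItemWriter)
                    .build();
        }
    }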

Design # 2:

Step # 1

ItemReader: Use a composite ItemReader (a group of ItemReaders) to read all the required tables.

ItemWriter: Save the result sets as lists of objects (one list per table) in the execution context.

Step # 2

ItemReader: Retrieve the lists of objects available in the execution context and group them into one list of objects, based on the business processing, so that the processor can process them.

ItemProcessor: Process the chunk of objects returned by the ItemReader. Do the validation and computation and add the details to be written to the DB as objects in a list.

ItemWriter: Write the details available in the objects using a custom ItemWriter (insert/update/delete operations).

With this design, we can REDUCE the number of DB transactions, but we delay the processing until all table records have been retrieved and stored in the execution context, i.e. we are not using the parallel processing provided by Spring Batch.
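For Design # 2, the extra glue I have in mind is roughly the following sketch (again with made-up key names and a placeholder TableRow class): step 1's writer saves the per-table lists in the step execution context, and an ExecutionContextPromotionListener promotes them to the job execution context so that step 2 can pick them up.

    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.core.listener.ExecutionContextPromotionListener;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.ItemWriter;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class Design2StepOneConfig {

        /** Placeholder for a row read from any of the source tables. */
        public static class TableRow { }

        @Bean
        public ExecutionContextPromotionListener promotionListener() {
            // Promote the lists saved by step 1 into the job execution context
            ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
            listener.setKeys(new String[] {"transactionRows", "masterRows"}); // hypothetical keys
            return listener;
        }

        @Bean
        public Step loadTablesStep(StepBuilderFactory steps,
                                   ItemReader<TableRow> compositeReader,
                                   ItemWriter<TableRow> contextSavingWriter) {
            return steps.get("loadTablesStep")
                    .<TableRow, TableRow>chunk(1000)
                    .reader(compositeReader)        // composite reader over the source tables
                    .writer(contextSavingWriter)    // stores the rows under the keys above
                    .listener(promotionListener())  // runs after the step and promotes the keys
                    .build();
        }
    }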

Please advise whether the above is feasible using Spring Batch or whether we need to use a conventional Java program.

Don't store everything in the execution context, as that is serialized to the storage used to store the execution details. Besides that, reading everything into memory isn't really smart, as you will eventually run into memory issues. What is wrong with more transactions? They also give you more control and the possibility to restart from a certain point (i.e. where it failed). - M. Deinum
We will be having an OLTP database which will be used by web application(s) and these kinds of batches. These batches need to update the transaction tables used by the web application(s) quite frequently. As we cannot control the online transactions, we are planning to reduce the number of batch transactions to avoid overloading the OLTP DB. - Vijay
The transactions should be small enough not to cause (that much) trouble for your OLTP. If you run a long transaction, you hold a lock for a very long time, and that is guaranteed to trouble your online process. - M. Deinum

2 Answers

0 votes

From my understanding, Spring Batch has nothing to do with database batch operations (or at least the word 'batch' has a different meaning in these two contexts). Spring Batch is used to create processes with multiple steps, and it gives you the chance to restart a process if one of its steps fails (without repeating the previously finished steps).
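For illustration (the bean and parameter names here are made up): launching the same job again with the same identifying JobParameters restarts the failed job instance, and steps that already finished with status COMPLETED are not executed again.

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;

    public class RestartExample {

        public JobExecution runTwice(JobLauncher jobLauncher, Job multiStepJob) throws Exception {
            JobParameters parameters = new JobParametersBuilder()
                    .addString("runDate", "2016-01-31")   // identifying parameter
                    .toJobParameters();

            // First launch: suppose a later step fails, leaving the job execution FAILED
            jobLauncher.run(multiStepJob, parameters);

            // Second launch with the SAME parameters: the failed job instance is restarted;
            // previously completed steps are skipped and processing resumes at the failed step
            return jobLauncher.run(multiStepJob, parameters);
        }
    }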

0 votes

The good news is that your problem description matches a very common use case for Spring Batch. The bad news is that the problem description is too generic to allow much meaningful input about the specific design beyond the comments already provided.

Spring Batch brings facilities similar to JCL and ISPF from the mainframe world into the Java context.

Spring Batch provides a framework for organizing and managing the boundaries of your process. It is a natural fit for a lot of ETL and big data operations, but it is not the only way to write these processes.

If your process can be broken down into discrete steps, then Spring Batch is a good choice for you.

The ItemReader should (logically) be an iterator returning a single object representing the start of one logical unit of work (LUW). The LUW object is captured by the chunker and assembled into collections of the size you configure, and then passed to the processor. The result of the processor is then passed to the writer. In the context of an RDBMS-centric process, the commit happens at the end of the writer's operation.
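To make that reader contract concrete, here is a tiny sketch (Luw is just a placeholder class): read() returns one logical unit of work per call and null once the input is exhausted; the framework then assembles the returned items into chunks of the configured size before calling the processor and writer.

    import java.util.Iterator;
    import java.util.List;

    import org.springframework.batch.item.ItemReader;

    public class LuwReader implements ItemReader<LuwReader.Luw> {

        /** Placeholder for one logical unit of work. */
        public static class Luw { }

        private final Iterator<Luw> source;

        public LuwReader(List<Luw> units) {
            this.source = units.iterator();
        }

        @Override
        public Luw read() {
            // One item per call; returning null signals that the input is exhausted
            return source.hasNext() ? source.next() : null;
        }
    }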

What happens in each of those pieces of the step is 100% whatever you need (plain old Java). The point of the framework is to free you from that complexity and let you concentrate on solving the problem.