
I have a Spring Batch job that runs daily and processes around 100k records. It is configured as follows.

ItemReader: I use a JdbcCursorItemReader that reads from a single table (this table holds all the source records). The chunk size is 1000.

ItemProcessor: This validates every record. Validation checks the data for correctness, and once those checks are complete I have to look up a few more tables for the same record.

ItemWriter: This updates the final tables based on the validation results. It is a bulk operation, and I use JdbcTemplate.batchUpdate for faster processing.

Results: Processing 104000 records took around 140 min. Since the job runs daily and many other jobs run in parallel in production, I want to improve its performance.

Can someone suggest a better way to speed up this batch? (I tried the multithreaded approach that Spring Batch provides, using a TaskExecutor in the step configuration, but I got cursor errors in the reader, shown below.)

**Caused by: org.springframework.dao.InvalidDataAccessResourceUsageException: Unexpected cursor position change.
at org.springframework.batch.item.database.AbstractCursorItemReader.verifyCursorPosition(AbstractCursorItemReader.java:368)
at org.springframework.batch.item.database.AbstractCursorItemReader.doRead(AbstractCursorItemReader.java:452)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:88)
at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:91)
at org.springframework.batch.core.step.item.FaultTolerantChunkProvider.read(FaultTolerantChunkProvider.java:87)**

Screenshot of CPU sample inside ItemProcessor


1 Answer


Use JVisualVM to find the bottlenecks inside your application. Since you said "for processing 104000 records job took around 140 min", profiling will give you much better insight into where the time is going.
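
To put the quoted figures in perspective before profiling, here is a small arithmetic sketch. The class and method names are illustrative only; the inputs (104000 records, 140 min, chunk size 1000) come from the question. It works out to roughly 12.4 records per second, or about 81 ms per record.

```java
// Illustrative helper (hypothetical names): converts a batch run's totals
// into per-second and per-record figures.
final class BatchThroughput {

    // Records processed per second over the whole run.
    static double recordsPerSecond(long records, double minutes) {
        return records / (minutes * 60.0);
    }

    // Average wall-clock milliseconds spent per record.
    static double millisPerRecord(long records, double minutes) {
        return (minutes * 60_000.0) / records;
    }

    public static void main(String[] args) {
        long records = 104_000;   // from the question
        double minutes = 140;     // from the question
        int chunkSize = 1000;     // from the question

        double perRecordMs = millisPerRecord(records, minutes);
        System.out.printf("%.1f records/s%n", recordsPerSecond(records, minutes));
        System.out.printf("%.1f ms per record%n", perRecordMs);
        System.out.printf("%.1f s per chunk of %d%n", perRecordMs * chunkSize / 1000.0, chunkSize);
    }
}
```

An average in that range per record suggests the time is going somewhere specific inside the read, process, or write path, which is what the sampler can confirm.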

VisualVM tutorial

Open VisualVM, connect to your application, then go to Sampler => CPU => CPU Samples. Take snapshots at several points during the run and analyse where most of the time is being spent. Only with that data will you know what to optimise.
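
If attaching a profiler in production is awkward, a rough complement is to time each stage of the processor by hand. The following is a minimal sketch, not part of Spring Batch: StageTimer and the stage names "validate" and "lookupOtherTables" are hypothetical, and the lambdas in main stand in for the real validation and lookup code.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical helper: accumulates wall-clock time and call counts per named stage.
final class StageTimer {
    private final Map<String, LongAdder> nanosByStage = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> callsByStage = new ConcurrentHashMap<>();

    // Runs the work, records its duration under the stage name, and returns its result.
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.computeIfAbsent(stage, k -> new LongAdder()).add(System.nanoTime() - start);
            callsByStage.computeIfAbsent(stage, k -> new LongAdder()).increment();
        }
    }

    // Number of times the stage was timed (0 if it was never seen).
    long calls(String stage) {
        LongAdder adder = callsByStage.get(stage);
        return adder == null ? 0L : adder.sum();
    }

    // Total nanoseconds recorded for the stage (0 if it was never seen).
    long totalNanos(String stage) {
        LongAdder adder = nanosByStage.get(stage);
        return adder == null ? 0L : adder.sum();
    }

    // One line per stage, sorted by stage name.
    String report() {
        StringBuilder sb = new StringBuilder();
        for (String stage : new TreeMap<>(callsByStage).keySet()) {
            sb.append(String.format("%s: %d calls, %.3f ms total%n",
                    stage, calls(stage), totalNanos(stage) / 1_000_000.0));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        for (int i = 0; i < 1000; i++) {
            final int id = i;
            // Placeholder work; in a real processor these would be the validation and table lookups.
            boolean valid = timer.time("validate", () -> id % 2 == 0);
            timer.time("lookupOtherTables", () -> valid ? "found" : "skipped");
        }
        System.out.print(timer.report());
    }
}
```

Logging the report at the end of the step gives a coarse breakdown that can be compared against the CPU samples. It measures wall-clock time, so time spent waiting on the database is included.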

Note: JVisualVM ships with the Oracle JDK 8 distribution, so you can simply type jvisualvm at the command prompt/terminal. If it is not available, download it from here.