I am running around 18,000 Spring Batch jobs in parallel, each with a single step. Each step reads from a file, converts and manipulates the values, and writes them to a MongoDB and a MySQL database, nothing unusual. After all of the jobs have finished, memory consumption stays at around 20 GB and never drops. I construct my Spring Batch components as follows:
@Autowired
public ArchiveImportManager(final JobRepository jobRepository, final BlobStorageConfiguration blobConfiguration,
        final JobBuilderFactory jobBuilderFactory, final StepBuilderFactory stepBuilderFactory,
        final ArchiveImportSettings settings) {
    this.jobBuilderFactory = jobBuilderFactory;
    this.stepBuilderFactory = stepBuilderFactory;
    // Launcher gets its own thread pool so jobs are executed asynchronously in parallel
    this.jobLauncher = new SimpleJobLauncher();
    final ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
    threadPoolTaskExecutor.setCorePoolSize(THREAD_POOL_SIZE);
    threadPoolTaskExecutor.setMaxPoolSize(THREAD_POOL_SIZE);
    threadPoolTaskExecutor.setQueueCapacity(THREAD_POOL_QUEUE);
    threadPoolTaskExecutor.initialize();
    this.jobLauncher.setTaskExecutor(threadPoolTaskExecutor);
    this.jobLauncher.setJobRepository(jobRepository);
}
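For reference, THREAD_POOL_SIZE and THREAD_POOL_QUEUE are plain int constants; the exact numbers should not matter for the question, but for illustration assume something like:

private static final int THREAD_POOL_SIZE = 32;   // illustrative value only
private static final int THREAD_POOL_QUEUE = 200; // illustrative value only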
I create one job as follows:
private Job createImportJob(final ArchiveResource archiveResource, final int current, final int archiveSize) {
    final String name = "ImportArchiveJob[" + current + "|" + archiveSize + "]" + new Date();
    final Step step = this.stepBuilderFactory
            .get(name)
            .<ArchiveResource, ArchiveImportSaveData>chunk(1)
            .reader(getReader(archiveResource, current, archiveSize))
            .processor(getProcessor(current, archiveSize))
            .writer(getWriter(current, archiveSize))
            .build();
    return this.jobBuilderFactory
            .get(name)
            .flow(step)
            .end()
            .build();
}
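The getReader/getProcessor/getWriter factories build fresh single-item components for every job. Simplified (the actual parsing and persistence logic is omitted; convert and saveToMongoAndMySql are placeholder names), they look roughly like this:

private ItemReader<ArchiveResource> getReader(final ArchiveResource archiveResource,
        final int current, final int archiveSize) {
    // One-shot reader: hands out the archive once, then returns null to end the step
    final AtomicBoolean consumed = new AtomicBoolean(false);
    return () -> consumed.compareAndSet(false, true) ? archiveResource : null;
}

private ItemProcessor<ArchiveResource, ArchiveImportSaveData> getProcessor(final int current,
        final int archiveSize) {
    return archiveResource -> convert(archiveResource); // parse and transform the values
}

private ItemWriter<ArchiveImportSaveData> getWriter(final int current, final int archiveSize) {
    return items -> items.forEach(this::saveToMongoAndMySql); // persist to both databases
}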
And I start all of the jobs in a loop:
private void startImportJobs(final List<ArchiveResource> archives) {
    final int size = archives.size();
    for (int i = 0; i < size; i++) {
        final ArchiveResource ar = archives.get(i);
        final Job j = createImportJob(ar, i, size);
        try {
            this.jobLauncher.run(j, new JobParametersBuilder()
                    .addDate("startDate", new Date())
                    .addString("progress", "[" + i + "|" + size + "]")
                    .toJobParameters());
        } catch (final JobExecutionAlreadyRunningException e) {
            log.info("Already running", e);
        } catch (final JobRestartException e) {
            log.info("Restarted", e);
        } catch (final JobInstanceAlreadyCompleteException e) {
            log.info("Already completed", e);
        } catch (final JobParametersInvalidException e) {
            log.info("Parameters invalid", e);
        }
    }
}
Do I have to release the memory somehow, or delete the jobs after they have finished, or something like that? I do not understand why memory consumption stays that high.
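For example, if the in-memory map-based job repository were in use (I am not sure whether that is the case in my setup), would I have to clear it after the runs, along these lines?

// Only relevant if the JobRepository comes from MapJobRepositoryFactoryBean,
// the in-memory implementation that keeps every JobExecution on the heap
@Autowired
private MapJobRepositoryFactoryBean mapJobRepositoryFactoryBean;

private void clearJobHistory() {
    this.mapJobRepositoryFactoryBean.clear(); // drops all stored job and step executions
}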
Best regards