1
votes

I'm working on a project using Spring Batch. In this project I use Spring Batch late binding to inject a parameter (a string used as a condition in the reader's SQL query) from JobParameters. I'm currently using @StepScope for that late binding, and everything works perfectly fine.

What I'm asking here is when to use @StepScope and when to use @JobScope. I've read the Spring Batch reference and googled both StepScope and JobScope, but all I've found is this:

a. StepScope: Spring Batch will use the Spring container to instantiate a new instance of that component for each step execution.

b. JobScope: There will be only one instance per executing job.
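For reference, here is a minimal sketch of how the two annotations might look in Java configuration. The `customer` table, the `Customer` record, the `RunSettings` class and the job parameter name `status` are all hypothetical, made up for illustration. It assumes Java 17 and a `DataSource` bean defined elsewhere.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.JobScope;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ScopedBeansConfig {

    // Hypothetical row type, for illustration only.
    public record Customer(long id, String name) {}

    // Hypothetical holder for a value resolved once per job execution.
    public static class RunSettings {
        private final String status;
        public RunSettings(String status) { this.status = status; }
        public String getStatus() { return status; }
    }

    // Step scope: a fresh reader is created for each step execution,
    // and the job parameter is bound late, when that step starts.
    @Bean
    @StepScope
    public JdbcCursorItemReader<Customer> customerReader(
            DataSource dataSource,
            @Value("#{jobParameters['status']}") String status) {
        return new JdbcCursorItemReaderBuilder<Customer>()
                .name("customerReader")
                .dataSource(dataSource)
                .sql("SELECT id, name FROM customer WHERE status = ?")
                .preparedStatementSetter(ps -> ps.setString(1, status))
                .rowMapper((rs, rowNum) ->
                        new Customer(rs.getLong("id"), rs.getString("name")))
                .build();
    }

    // Job scope: one instance per job execution, visible to every step of it.
    @Bean
    @JobScope
    public RunSettings runSettings(
            @Value("#{jobParameters['status']}") String status) {
        return new RunSettings(status);
    }
}
```

Both scopes support late binding of `jobParameters`, so the choice between them comes down to how long the instance should live, not to whether the parameter can be injected.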

I just can't work out what should drive the choice between StepScope and JobScope. Can someone explain a little deeper?


2 Answers

1
votes

A step is composed of a read stage, a transform/process stage, and a write stage. The write happens per chunk, with retry/rollback complications reaching back through the process stage, while the read stage is usually no-rollback, no-retry. A job is composed of as many such steps as you like. So a step-scoped bean is the same instance for the read/process/write phases and the listeners of a given step, and a job-scoped bean is the same instance for all steps in a job.

So, suppose you need the same listener for some processing in multiple steps of a job. For example, you have one step that converts data to an intermediary format and validates it, then a step that processes all the data into your database, and you want the same listener to do some async auditing somewhere. In that case you would job scope the listener and register it against both steps in the job. Each step would then see the same instance of the object behind the proxy, and the same methods on that same instance would be called for things like "on read error", "after write" or "after step" (depending on whether you're using an annotation-based or interface-based listener, and on what you're listening for).

Your reader, for example, is (or should be) used by only one step at any time, so making things like readers step scoped is usually correct: the reader is created and pointed at a particular resource for that step.

A better example is a listener that has to clean up a directory or do something else after a step has completed. The bean type is the same for every step, and you might even want it to do the same thing, but the details change with each step: you want it to use a new directory, a new UUID prefix or something similar per step. You'd have one definition for that bean, perhaps using the same job parameters, but the definition would new up a temp directory or UUID. You'd set that bean definition to step scope, so that when you wire it into two different steps they get two different actual objects behind the lazy proxies.
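A rough sketch of that per-step listener idea, assuming a hypothetical job parameter called `workRoot` that points at a base directory. The bean name and directory naming are made up for illustration too.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class WorkDirListenerConfig {

    // One bean definition, but because it is step scoped this method runs
    // again for each step execution that uses the listener.
    @Bean
    @StepScope
    public StepExecutionListener workDirListener(
            @Value("#{jobParameters['workRoot']}") String workRoot) {
        // A new UUID, and therefore a new directory, per step execution.
        Path workDir = Paths.get(workRoot, "step-" + UUID.randomUUID());

        return new StepExecutionListener() {
            @Override
            public void beforeStep(StepExecution stepExecution) {
                try {
                    Files.createDirectories(workDir);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public ExitStatus afterStep(StepExecution stepExecution) {
                // Cleanup of workDir would go here.
                // Returning null leaves the step's exit status unchanged.
                return null;
            }
        };
    }
}
```

Registering this one bean on two different steps should give each step its own directory, since each step execution resolves the proxy to its own listener instance.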

Now, a job listener should probably be job scoped. But that raises a question: what if you want the same instance across all steps and all jobs? In that case you use a "regular scoped" singleton instead.

  • Outside of a job you cannot see job scope or step scope
  • Outside of a step you cannot see step scope

Or

  • Inside of a step you can see job and step scope beans
  • Inside of a job you can see all Job scope beans or singletons

Also To Consider:

When you create a singleton that has a job-scoped or step-scoped bean wired into it, the singleton itself is the same object used "everywhere", but the scoped object it references (really a proxy) resolves to a different instance per job or step accordingly. So you can have a singleton bean that represents your step, and that step can reference step-scoped beans. The Step instance you build your Job with could, and possibly should, just be a singleton, with the step-scoped beans it relies on wired into its bean method / definition / constructor. At execution time those resolve to different instances.
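To make the proxy behaviour concrete, here is a toy model in plain Java. It is not how Spring implements scopes; `Scope`, `Bean` and `SingletonService` are invented names that only mimic the idea of "one cached instance per active job or step".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Toy model of scoped-bean lookup; NOT Spring's implementation. */
class ScopeModel {

    /** Stands in for a scoped bean; the id shows which instance we got. */
    static final class Bean {
        final int id;
        Bean(int id) { this.id = id; }
    }

    /** Plays the role of the proxy: resolves to one instance per active key. */
    static final class Scope<T> {
        private final Map<String, T> instances = new HashMap<>();
        private final Supplier<T> factory;
        private String activeKey;

        Scope(Supplier<T> factory) { this.factory = factory; }

        void enter(String key) { activeKey = key; }

        T get() {
            if (activeKey == null) {
                // Mirrors "outside of a step you cannot see step scope".
                throw new IllegalStateException("no active scope");
            }
            return instances.computeIfAbsent(activeKey, k -> factory.get());
        }
    }

    /** A singleton that holds proxies, not concrete scoped instances. */
    static final class SingletonService {
        private final Scope<Bean> jobScoped;
        private final Scope<Bean> stepScoped;

        SingletonService(Scope<Bean> jobScoped, Scope<Bean> stepScoped) {
            this.jobScoped = jobScoped;
            this.stepScoped = stepScoped;
        }

        String describe() {
            return "job=" + jobScoped.get().id + ",step=" + stepScoped.get().id;
        }
    }

    static List<String> run() {
        AtomicInteger ids = new AtomicInteger();
        Scope<Bean> jobScope = new Scope<>(() -> new Bean(ids.incrementAndGet()));
        Scope<Bean> stepScope = new Scope<>(() -> new Bean(ids.incrementAndGet()));
        SingletonService service = new SingletonService(jobScope, stepScope);

        List<String> seen = new ArrayList<>();
        jobScope.enter("job-1");
        for (String step : List.of("step-1", "step-2")) {
            stepScope.enter(step);
            seen.add(service.describe()); // first use inside this step
            seen.add(service.describe()); // same step, so same instances
        }
        return seen;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The single `SingletonService` object sees the same job-scoped instance throughout, while the step-scoped instance changes when the second step begins.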

-1
votes

Each job is composed of three steps, a read step, a process step, and a write step. If you create a bean with Step scope, then you could reference it from each of these three steps - but you would get a different instance of that bean in each step. If you create a bean with Job scope, then you could reference it from each of these three steps - and it would be the same instance in all three contexts.

So, if you need to have one step store something in a bean that a later step in the same job will access, you want that bean to be in Job scope. If you want to guarantee that the data that any step stores and manipulates in your bean is local to that step (hidden from the others), then you want that bean to be in Step scope.