
I'm working on a Spring Batch job that builds SQL insert, update, and delete statements as strings. It reads in a flat file where the first three characters of each line are ADD, CHG, or DEL.

Example:

ADD123456001SOUTHLAND PAPER INCORPORATED  ... //more info
CHG123456002GUERNSEY BIG DEAL FAIRFAX     ...//more info
DEL123456002GUERNSEY BIG DEAL FAIRFAX     ...//more info

From the lines above, my ItemReader will generate three strings: insert into ..., update ... and delete .... The reader reads in the entire flat file and returns an ArrayList of these strings to my writer, which then runs them against my database.

Here's my problem. What happens if there's a chg requested before an add is requested? What if I try changing something that's already deleted?

I read up on ItemProcessor in the Spring Batch documentation, and its description of filtering records is exactly what I'm trying to do:

For example, consider a batch job that reads a file containing three different types of records: records to insert, records to update, and records to delete. If record deletion is not supported by the system, then we would not want to send any "delete" records to the ItemWriter. But, since these records are not actually bad records, we would want to filter them out, rather than skip. As a result, the ItemWriter would receive only "insert" and "update" records.

But the ItemProcessor examples in the docs don't really make sense to me. Can someone explain the process? Or show me some examples of good item processing?

Edit: the six characters following the command are the id of the associated row in the SQL database.


1 Answer


In the case described in the question you're not filtering out records; you just want to change the order in which they come through. You'd be better off sorting the file in an earlier step, so that the inserts run first, then the updates, then the deletes. ItemProcessor is more suited to filtering out the occasional bad or irrelevant input line.
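As a rough illustration of that sorting step, here is a minimal sketch in plain Java. The class name `CommandSorter` and its methods are made up for this example, and it assumes the whole file fits in memory as a list of lines and that every command starts with one of the three prefixes from the question. Note that a global reorder like this only makes sense if the file never relies on its original order (for example a DEL followed by a fresh ADD of the same id).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of Spring Batch): orders raw input lines
// so that ADD lines come first, then CHG, then DEL.
final class CommandSorter {

    // Rank of the three-character command prefix; unknown prefixes sort last.
    static int rank(String line) {
        String cmd = line.length() >= 3
                ? line.substring(0, 3).toUpperCase(Locale.ROOT)
                : "";
        switch (cmd) {
            case "ADD": return 0;
            case "CHG": return 1;
            case "DEL": return 2;
            default:    return 3;
        }
    }

    // Returns a sorted copy of the input. List.sort is stable, so lines
    // that share a command keep their original relative order.
    static List<String> sortByCommand(List<String> lines) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparingInt(CommandSorter::rank));
        return sorted;
    }
}
```

In a real job this could run in a step (or a tasklet) that rewrites the file before the chunk-oriented step reads it; how you wire that up depends on your job configuration.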

You could use the ItemProcessor to validate that the row to be updated or deleted exists, or that the row to be added isn't already present. The concern is overhead: that means one query per row in the input file, which is a lot of work for a condition that might only occur occasionally. Your choice would be between

  • using the ItemProcessor to filter (doing a query up front for each row), or
  • not doing any up-front queries, but having the ItemWriter skip these rows instead if referential integrity is violated (rolling back the chunk and retrying one line at a time), see Spring Batch skip exception for ItemWriter.
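To make the first option more concrete, here is a minimal sketch of a filtering processor. Several things in it are assumptions rather than anything from the question: the class name `ExistenceFilterProcessor` is invented, the in-memory set of ids stands in for the per-row database query, and the small `ItemProcessor` interface is a local stand-in so the example compiles without Spring on the classpath. In a real job you would implement `org.springframework.batch.item.ItemProcessor`, which has the same contract: returning null from `process` filters the item, so it never reaches the ItemWriter.

```java
import java.util.HashSet;
import java.util.Set;

// Local stand-in mirroring the contract of Spring Batch's ItemProcessor:
// returning null means "filter this item out".
interface ItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical processor: drops commands that cannot apply to the current data.
final class ExistenceFilterProcessor implements ItemProcessor<String, String> {

    // Assumption: ids currently in the table. A real job would query the
    // database here instead of keeping a set in memory.
    private final Set<String> knownIds;

    ExistenceFilterProcessor(Set<String> existingIds) {
        this.knownIds = new HashSet<>(existingIds);
    }

    @Override
    public String process(String line) {
        if (line.length() < 9) {
            return null; // too short to hold a command plus a six-character id
        }
        String cmd = line.substring(0, 3);
        String id = line.substring(3, 9);
        switch (cmd) {
            case "ADD": // keep only if the id is not already present
                return knownIds.add(id) ? line : null;
            case "CHG": // keep only if there is a row to change
                return knownIds.contains(id) ? line : null;
            case "DEL": // keep only if there is a row to delete
                return knownIds.remove(id) ? line : null;
            default:    // unknown command: filter it out
                return null;
        }
    }
}
```

With this sketch, a CHG that arrives before its ADD, or after its DEL, returns null and is silently filtered. The set is updated as lines pass through, which assumes every line that is kept is later written successfully; if the writer can fail or skip, that bookkeeping would need more care.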