4
votes

The ragged right flat file option is great for a single trailing field terminated with a CRLF, but when any number of trailing fields which are blank are suppressed and the line terminated early with CRLF, I have found no alternative in SSIS but to read a few columns in with a single "REMAINDER" column and then used a Derived Column transform with SUBSTRING operations to extract the "optional" columns one by one.

One possibility which has occurred to me is if there is any way to pass the line through a "padder" component within SSIS prior to the connection manager getting a hold of it. Then you could get all the benefits of using the connection manager to set up the incoming column lengths. I could always create an external program (or a completely separate SSIS dataflow) to pass a file through which would pad every row out with spaces (and with ragged right - just adding the same minimum number of spaces to account for all suppressed fields to every line would be sufficient), but that does seem rather wasteful of disk space and also require the file to be read (and written) completely one more time.

Alternatively, perhaps there is a third party data source component which will simply NULL or blank and columns automatically if the line is terminated prematurely.

Any other options?

2

2 Answers

2
votes

I think your proposed solution sounds reasonable. Disk space is cheap. It is perfectly acceptable to have a two-step process in which you clean the formatting before you import the data.

1
votes

Use a script component to separated the column as the data source.