1
votes

I'm fairly proficient in SSIS, but by no means an expert. I have been tasked with importing a fairly complicated fixed width flat file with header and trailer rows. Now, without the header and trailer rows I feel confident I could do this fairly easily. However, I am stumped on how to do this with the extra rows. This is a simplified version for this example. First row is header row, last row trailer row between is any amount of data.

0 201308190102
1 123412341234   MR            JOE              BLOGGS                19810529
1 123412341235   MS            SUE              BLOGGS                19810516
9 201308190102

Columns are in header row : action type 0 = header, today's date YYYYMMDD, sequence number 01, expected row count 02

data row : action type 1,2,3 dependant on create, update, delete, card number 123412341234, title, firstname, lastname, date of birth trailer row : action type 9 = footer, today's date, sequence number, actual row count.

I'll need to do validation checks on the date, sequence number and row count of the files.

Now, I'm not sure how to set this up in the flat file connection manager as the header/trailer detail is in the same position as the card numbers.

So my thinking is split the header and trailer rows out by row number and end of file or something? Would I need to split them both out in a script task? Would I need to setup the flat file connection line by line and split out the columns by script task?

2
Headers can be skipped. Footers, unfortunately throw a spanner into the works. How many rows would you guesstimate the file to average?billinkc
And the expected row count in header and footer, that is exclusive of the header and footer? Mainframe folks I dealt with always used inclusivebillinkc
hi billin, the system i'm creating would need to deal with creating new records, updating records, marking records as inactive anything up to and over 100000 records in one file...jhowe

2 Answers

2
votes

If you can confidently determine the header and footer rows from the data, I suggest using a conditional split. For example, in a past project the first two characters of the header and footer rows were "HR" and "TR" respectively.

With that convention I was able separate the rows I actually needed to process. In a separate path I did some validation based on header row data. Finally, I merged the two paths as I used the date in the header for a processing date column in destination table.

enter image description here

0
votes

If you have some experience coding in c# I would recommend a scrip task. I ran into the same problem while parsing some license files and it seemed to be the only way to parse more complex files correctly.

If you do a scrip task you can bypass the flat file connection etirely. Here's a brief how to:

First add a String parameter to your package. In your scrip task options add your parameter as a readonly variable. Then access it with this snippet

byte[] empty = new byte[] { };
    base.PreExecute();
    //Outside parameter for file path
    //Check for the parameter existance
    IDTSVariable100 filePathVariable; //This is the var that will hold the parameter
    try
    {
        filePathVariable = this.ReadOnlyVariables["ParameterName"];
    }
    catch (Exception)
    {
        this.Log(@"The package configuration is invalid. The variable / parameter ""ParameterName"" is missing. 
            It has to be available and be of type string and give the path to the file to be Imported", 0, empty);
        throw;
    }
    string filePath = filePathVariable.Value.ToString();

Now use something like this to read the file and split it into lines..

try
    {
        using (StreamReader reader = new StreamReader(filePath))
        {
            //Read file and split into lines
            string fileStream = reader.ReadToEnd();
            fileLines = fileStream.Split('\n');
        }
    }
    catch (Exception e)
    {
        this.Log(@"Error reading file into list of strings. Reason: " + e.Message, 0, empty);
        throw;
    }

Put all of the above in the PreExecute section of your script task and what ever transformations you want to each line. Then create an array or list of lines and output it like this:

  public override void CreateNewOutputRows()
{
    for (int i = 0; i < lineCount; i++)
    {
        ItemsBuffer.AddRow();
        ItemsBuffer.ColumnName = CreatedLineList[i];
    }
}

if you have any questions about it let me know