1
votes

I have an SSIS package that is inputting data from a Flat File into a SQL 2008 database table. The 3rd party generates the Flat File (.csv) every day. There are leading whitespaces in every field that I need to remove.

I thought a Script Component would do the trick?

I want to have it loop through all of the Input Columns and LTrim(RTrim) all of the values for every column.

I found this code here: http://microsoft-ssis.blogspot.com/2010/12/do-something-for-all-columns-in-your.html

But, I don't know how to change it to Trim the values?

I tried changing the "ValueOfProperty.ToUpper()" to "ValueOfProperty.Trim()" but then it causes an error on the component "Error 30203: Identifier expected..."

Help please??

Here's my SSIS Data Flow:

Flat File > Data Conversion > Script Component > OLE DB Destination

Data Flow

    ' This script adjusts the value of all string fields
Imports System
Imports System.Data
Imports System.Math
Imports System.Reflection ' Added
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<microsoft .sqlserver.dts.pipeline.ssisscriptcomponententrypointattribute=".sqlserver.dts.pipeline.ssisscriptcomponententrypointattribute"> _
<clscompliant false="false"> _

Public Class ScriptMain
Inherits UserComponent

' Method that will be started for each record in you dataflow 
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    ' Use Reflection to loop through all the properties of Row:
    ' Example:
    ' Row.Field1            (String)
    ' Row.Field1_IsNull     (Boolean)
    ' Row.Field2            (String)
    ' Row.Field2_IsNull     (Boolean)
    Dim p As PropertyInfo
    For Each p In Row.GetType().GetProperties()
        ' Do something for all string properties: Row.Field1, Row.Field2, etc.
        If p.PropertyType Is GetType(String) Then
            ' Use a method to set the value of each String type property
            ' Make sure the length of the new value doesn't exceed the column size
            p.SetValue(Row, DoSomething(p.GetValue(Row, Nothing).ToString()), Nothing)
        End If
    Next
End Sub

' New function that you can adjust to suit your needs
Public Function DoSomething(ByVal ValueOfProperty As String) As String
    ' Uppercase the value
    ValueOfProperty = ValueOfProperty.ToUpper() 'Maybe change this to Trim()?
    Return ValueOfProperty
End Function

End Class
2

2 Answers

2
votes

I think you could achieve this with a Data Conversion task instead of a script task in SSIS. I think that might be more straightforward and you can simply apply an expression on each column to trim, either adding a new column to your set and replacing your old column OR updating the value returned by the same column to then be used by your next data flow task.

I personally think this will be a little nicer to do in the GUI drag/drop wise and you won't have to consider the script task at all in that case! Only just the expression to use, which I am linking some MSDN documentation on. You can then set it explicitly based on each column you have in your flat file data source.

Trim documentation on MSDN: https://msdn.microsoft.com/en-us/library/ms139947.aspx

Don't have an SSDT or VSBI install on this here home machine or I'd set up a simple example of the data flow and take a screenshot.

Another seemingly useful article on Trim in SSIS: http://www.bradleyschacht.com/trim-functions-in-ssis/

1
votes

I figured out how to get the script to Trim all of the values with the For Each Loop. This is helpful when you have a lot of columns, or want to use this same solution for multiple packages.

As Dinglemeyer and Jim stated in the comments, the downside is that SSIS will not reflect what's happening in the script itself. I have included the Alternate Method of using a Derived Column object below.

Alternate Method:

Use a Derived Column object.

  1. In the Derived Column Transformation Editor, under ‘Derived Column’ use the drop-down to select Replace ‘your_column_name’

  2. Under ‘Expression’ type in the expression you want to apply to that column, ie LTRIM(RTRIM(your_column_name))

enter image description here

The only downside is that you have to enter each column manually. This can be a prolonged process if you have a lot of columns.


ANSWER: Script to Auto Trim

  1. Add a Script Component between your Source and Destination in SSIS. I added mine after the Data Conversion

enter image description here

  1. Edit the Script Component. Select the Output columns only from the previous Task Object. In my case, the Output Columns from the Data Conversion. Change each column to 'ReadWrite'

enter image description here

  1. Select Script and then Design Script

  2. Copy/Paste the below code. Save and Run

You can modify this line in the script to any other normal expression:

ValueOfProperty = LTrim(RTrim(ValueOfProperty))

enter image description here

Imports System
Imports System.Data
Imports System.Math
Imports System.Reflection ' Added
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper



Public Class ScriptMain
Inherits UserComponent

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    '
    ' Use Reflection to loop through all the properties of Row:
    ' Example:
    ' Row.Field1            (String)
    ' Row.Field1_IsNull     (Boolean)
    ' Row.Field2            (String)
    ' Row.Field2_IsNull     (Boolean)
    Dim p As PropertyInfo
    For Each p In Row.GetType().GetProperties()
        ' Do something for all string properties: Row.Field1, Row.Field2, etc.
        If p.PropertyType Is GetType(String) Then
            ' Use a method to set the value of each String type property
            ' Make sure the length of the new value doesn't exceed the column size
            If (p.CanWrite) Then p.SetValue(Row, DoSomething(p.GetValue(Row, Nothing).ToString()), Nothing)
        End If

    Next
    '
End Sub
' New function that you can adjust to suit your needs
Public Function DoSomething(ByVal ValueOfProperty As String) As String
    ' Uppercase the value
    'ValueOfProperty = ValueOfProperty.ToUpper()

    'Trim Leading (LTrim) and Trailing (RTrim) Whitespace 
    'Change this to equal any normal expression
    ValueOfProperty = LTrim(RTrim(ValueOfProperty))

    Return ValueOfProperty
End Function
End Class

enter image description here

Some errors:

"The task is configured to pre-compile the script, but binary code is not found. Please visit the IDE..." This is a bug. Go into the Script and delete ‘End Class’. Save. Add ‘End Class’ back at the bottom of your script and Save again.

Property method not found. Added the ‘CanWrite’ if/then statement. Make sure your columns are set to ReadWrite, or they will be skipped.

Error 30203: Identifier expected Make sure the columns you have selected under ‘Input Columns’ in the Script Component are valid. In my case it was showing all columns from the Input and Output of the Data Conversion just prior to this Script Component. Needs to be the Output from the previous Data Flow object, so select only those columns.

The original code had this:

<microsoft .sqlserver.dts.pipeline.ssisscriptcomponententrypointattribute=".sqlserver.dts.pipeline.ssisscriptcomponententrypointattribute"> _
<clscompliant false="false"> 

Which caused an error at the =”.sqlserver. I removed these two lines

Thanks, and hopefully someone finds this useful! :)