4
votes

I have so far successfully used BIML to auto generate SSIS package (from CSV to SQL Server). But I got into problems where ever I have Varchar(MAX) columns in the Flat File Format.

The problem is If I define a column of type AnsiString with size -1 in the Flat file format, the output SSIS package shows the below warning

The metadata of the following output columns does not match the metadata of the external columns with which the output columns are associated.

If I click Yes, the problem is fixed by itself, but that would be my last option as I have 150 packages.

When I checked the Advanced options of Flat File Source Component I can see a difference in data type for the column Comments, External Columns show as DT_TEXT where as the Output Columns show DT_STR. :(

What I don't understand is why the Output columns showing a different data type only for Varchar(Max) when all others are working fine. Aren't the output columns generated from External columns?

Please see the biml code below.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
 <FileFormats>
        <FlatFileFormat Name="MetadataFileFormat" RowDelimiter="LF" ColumnNamesInFirstDataRow="true" IsUnicode="false">
            <Columns>
                <Column Name="Category" DataType="AnsiString" Length="128" Delimiter="|" CodePage="1252" />
                <Column Name="Comments" DataType="AnsiString" Length="-1" Delimiter="|" />
                <Column Name="DisplayName" DataType="AnsiString" Length="256" Delimiter="CRLF" />
            </Columns>
        </FlatFileFormat>
    </FileFormats>
	
	<Connections>
		<FlatFileConnection Name="FF_Test" FilePath="C:\Data\Sample.csv" FileFormat="MetadataFileFormat">
		</FlatFileConnection>
	</Connections>
	
	<Packages>
		<Package Name="FFTest" ConstraintMode="Linear">
			<Tasks>
				<Dataflow  Name="DFT Load Data">
					<Transformations>
						<FlatFileSource Name="FF_SRC" ConnectionName="FF_Test">
						</FlatFileSource>
					</Transformations>
				</Dataflow>
			</Tasks>
		</Package> 
	</Packages>
</Biml>

See the difference between External and Output columns here

2

2 Answers

4
votes

Within a dataflow a DT_STR is bounded between lengths of 0 to 8000. The Flat File Connection Manager is happy to let you specify a length greater than 8k.

enter image description here

However, when you try to use that in a data flow, the component is going to report that it's not a valid length

enter image description here

And it makes sense if you know the concepts of how SSIS gets the performance out of data flow. It preallocates memory and does all the transformations in that memory space. How much memory would you allocate for a MAX type? Exactly...

So, you're going to need to use one of the stream data types: DT_TEXT or DT_NTEXT. Those allow for unlimited length strings.

Biml

I'm actually stumped on this, hopefully Scott can chime in. The emitted DTSX will look as the before screenshot with a data type of DT_STR and length of zero. It runs fine, just looks bad. When you double click to let the editor fix it, it changes to DT_TEXT as it should.

enter image description here

I thought it was just going to be a matter of providing a data type override as we can in an Execute SQL Task, but to no avail, it's not a property on the Columns collection in the flat file source.

Perhaps this was a situation where I needed to mess with the Dataflow overrides property...

                        <DataflowOverrides>
                            <OutputPath OutputPathName="Output">
                                <Columns>
                                    <Column 
                                        ColumnName="Comments" 
                                        DataType="AnsiString" 
                                        CodePage="1252"
                                        Length="-1" 
                                        ></Column>
                                </Columns>
                            </OutputPath>
                        </DataflowOverrides>

But no, that gave me no better result.

Fine, I gave up and "cheated" by using Mist/BimlOnline to reverse engineer the corrected package back into Biml.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <FlatFileConnection Name="FF_Test" FilePath="C:\ssisdata\SO\Input\so_35438946.txt" FileFormat="FF_Test" />
  </Connections>
  <Packages>
    <Package Name="so_35438946_re" Language="None" VersionBuild="1" CreatorName="BillFellows" CreatorComputerName="AVATHAR" CreationDate="2016-02-16T13:02:49">
      <Tasks>
        <Dataflow Name="DFT Load Data">
          <Transformations>
            <DerivedColumns Name="DER Placeholder">
              <InputPath OutputPathName="FF_SRC.Output" />
            </DerivedColumns>
            <FlatFileSource Name="FF_SRC" LocaleId="None" FileNameColumnName="" ConnectionName="FF_Test" />
          </Transformations>
        </Dataflow>
      </Tasks>
      <Connections>
        <Connection ConnectionName="FF_Test" />
      </Connections>
    </Package>
  </Packages>
  <FileFormats>
    <FlatFileFormat Name="FF_Test" CodePage="1252" TextQualifer="_x003C_none_x003E_" ColumnNamesInFirstDataRow="true" RowDelimiter="LF">
      <Columns>
        <Column Name="Category" Length="128" DataType="AnsiString" Delimiter="VerticalBar" MaximumWidth="128" />
        <Column Name="Comments" Length="-1" DataType="AnsiString" Delimiter="VerticalBar" />
        <Column Name="DisplayName" Length="256" DataType="AnsiString" Delimiter="CRLF" MaximumWidth="256" />
      </Columns>
    </FlatFileFormat>
  </FileFormats>
</Biml>

And now I simply Generate SSIS package and... Well, I suppose it's progress. Comments is identified as DT_TEXT but I still get the warning.

enter image description here

Deep dive into the dtsx

In the data flow's flat file source, the external metadata collection for this column is defined as follows

<externalMetadataColumn
    codePage="1252"
    dataType="str"
    name="Comments"
    refId="Package\DFT Load Data\FF_SRC.Outputs[Output].ExternalColumns[Comments]"></externalMetadataColumn>

In the on we let the editor adjust

<externalMetadataColumn
    refId="Package\DFT Load Data\FF_SRC.Outputs[Output].ExternalColumns[Comments]"
    codePage="1252"
    dataType="text"
    name="Comments" />

and the one emitted from VS 2013 using the original code, we get

<externalMetadataColumn 
    codePage="1252" 
    dataType="str" 
    name="Comments" 
    refId="Package\DFT Load Data\FF_SRC.Outputs[Output].ExternalColumns[Comments]">
</externalMetadataColumn>

It might be distasteful but perhaps a bit of XSLT could find any of the instances where you have this named column and data type of str and transform it to text

0
votes

I didn't try, but found it on Varigance documentation:

<!-- A Length of -1 will automatically be converted to nvarchar(max)/varchar(max) -->
<Column Name="LongString" DataType="String" Length="-1" />