2
votes

I am using BIML and BIDSHelper to create SSIS package. I am trying to import data from csv to sql server. I want to create table in the destination database before the dataflow happens. Here is my code:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>       
    <OleDbConnection Name="CM_OLE" 
                     ConnectionString="Data Source=(localdb)\projects;Initial Catalog=test;Integrated Security=SSPI;Provider=SQLNCLI11">
    </OleDbConnection>
    <FlatFileConnection
            Name="FF Source"
            FileFormat="FFF Source"
            FilePath="F:\test.csv"
            CreateInProject="false" />
</Connections>
<FileFormats>
    <FlatFileFormat
            Name="FFF Source"
            CodePage="1252"
            RowDelimiter="CRLF"
            ColumnNamesInFirstDataRow="true"
            IsUnicode="false"
            FlatFileType="Delimited"
            TextQualifer="_x0022_"
            HeaderRowsToSkip="0">
        <Columns>               
            <Column Name="Column1" Length="50" InputLength="50" MaximumWidth="50" DataType="AnsiString"  ColumnType="Delimited"  CodePage="1252" Delimiter="," TextQualified="true" />
            <Column Name="Column2" Precision="10" Scale="2"  DataType="Decimal"  ColumnType="Delimited"  CodePage="1252" Delimiter="CRLF" TextQualified="true"  />
        </Columns>
    </FlatFileFormat>
</FileFormats>  
<Packages>      
    <Package ConstraintMode="Linear" Name="NumericParsingFromFlatFileInsertIdentity">
        <Tasks> 
            <ExecuteSQL Name="Create table sometablename" ConnectionName="CM_OLE">
                 <DirectInput>
                      CREATE TABLE sometablename(column1 varchar(50) NOT NULL, column2 varchar(10,2) NOT NULL);
                      GO 
                 </DirectInput>
            </ExecuteSQL>
            <Dataflow Name="DFT Source">
                <Transformations>
                    <FlatFileSource ConnectionName="FF Source" Name="FF Source" />
                    <OleDbDestination ConnectionName="CM_OLE" Name="OLEDB DST">
                        <ExternalTableOutput Table="sometablename"></ExternalTableOutput>
                    </OleDbDestination>                     
                </Transformations>
            </Dataflow>         
        </Tasks>
    </Package>
</Packages>

When I try to generate package it says cannot execute query select * from sometablename invalid object name. I understand that table sometablename doesnot exist so it throws the error. So, How could I create the table automatically? I have read the series BI Thoughts and Theories. Part 2 shows way to create table. My understanding is that at the end it also create ExecuteSQl to create the table. I am confused how to run table creation script before the dataflow or what other alternative BIML has to offer?

Thanks in advance

3

3 Answers

2
votes

It seems what you're trying to do is not possible with BIML.

SSIS dataflows require ALL external column metadata to be available at design time. There is no way around this, so the Biml compiler is required to query the data source to get this information, which is then emitted into the package. BIDS/SSDT does this validation constantly as you are working. Biml does it only at build time.

The purpose of ValidateExternalMetadata=false is actually for SSIS to refrain from checking that the external columns defined in the dataflow metadata still match the external data source during the validation phase when the package is run. But at design/build time, we still need that metadata to exist so that we can create the external column metadata in the first place. To be clear, this is true both for native BIDS/SSDT and for Biml.

ValidateExternalMetadata was provided by the SSIS team for scenarios such as dynamically creating tables or files that will match a predetermined schema. Usually you would have the schema prebuilt on your dev environment (which you build against) and then dynamically create the same schema on production as it's needed. Disabling validation means that you can do the dynamic creation as part of the same package that reads from or loads into those dynamically created objects.

We do recognize that there's a need to do builds without having the schema materialized in Dev either. One of the things we're looking at doing in a future release is an "Offline Metadata" feature that would allow you to use Biml to declare your dataflow metadata without having to retrieve it at build time. There would be some scripting work on the user's part to construct the metadata to match what it will look like at run time, but if they get that right, scenarios like yours will be enabled.

What you could do is add the ValidateExternalMetadata="false" to your OLE DB Destination. Create the table manually on your development environment and then generate the package.

It should execute without problems on any other environment because you set ValidateExternalMetadata to false.

1
votes

For anyone else trying to achieve this, Biml can now reference objects that don't exist through the OfflineSchema metadata elements. This allows you to specify tables or result sets that you can't connect to for the Biml engine to base the SSIS build on.

https://varigence.com/Documentation/Language/Element/AstOfflineSchemaNode

0
votes

On a somewhat related note, check out Samuel Vanga's article and pay attention to the "Create Objects" aspect. Running that package will create your tables in the DB, after which you can Generate the SSIS package that relies on those tables.

I used his example to implement the following workflow:

  1. Read Excel Workbook spreadsheets for field names, datatypes (this is a template given to clients when asking for data in flat files)
  2. Populate metadata tables with flat file names/ids, fields [names, datatypes, delimiters, precision, scale, etc]
  3. Read metadata tables to inform flat file sources, create staging tables, create packages that read flat file and populate staging tables.