
My goal is to pass data from one Azure SQL database (User DB) to another Azure SQL database (Datawarehouse) through a stored procedure.

I have created two linked services, one for each DB, and two datasets that I have doubts about.

The stored procedure in question collects data from a table, joins it with several other tables, and returns a result that should be stored in a table in the Datawarehouse.

The SP is like this:

ALTER PROCEDURE [DataWarehouse].[Item_init]
AS
BEGIN
    SET NOCOUNT ON
    SELECT a.Id, a.Name, a.Code, f.Name, s.Name, g.Name
    FROM Item.Item a
        JOIN Item.[Group] g ON g.idGroup = a.idGroup
        JOIN Item.Subfam s ON s.idSubfam = g.idSubfam
        JOIN Item.Fam f ON f.idFam = s.idFam
END

The dataset that collects data from the User DB (I think this one is not correct) looks like this:

{
    "name": "ds_SProcItem_init",
    "properties": {
        "published": false,
        "type": "AzureSqlTable",
        "linkedServiceName": "UserTable",
        "typeProperties": {
            "tableName": "Item.Item"
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}

The other dataset:

{
    "name": "ds_DWItemOutput",
    "properties": {
        "published": false,
        "type": "AzureSqlTable",
        "linkedServiceName": "DataWareHouse",
        "typeProperties": {
            "tableName": "Item"
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}

The pipeline that connects the two datasets is as follows:

{
    "name": "SprocItem_InitPipeline",
    "properties": {
        "activities": [
            {
                "type": "SqlServerStoredProcedure",
                "typeProperties": {
                    "storedProcedureName": "DataWarehouse.Item_init"
                },
                "inputs": [
                    {
                        "name": "ds_SProcItem_init"
                    }
                ],
                "outputs": [
                    {
                        "name": "ds_DWItemOutput"
                    }
                ],
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "SprocItem_Init"
            }
        ],
        "start": "2016-08-02T00:00:00Z",
        "end": "2016-08-02T05:00:00Z",
        "isPaused": false,
        "hubName": "pruebasaas_hub",
        "pipelineMode": "Scheduled"
    }
}

Please, could someone who knows this subject help me? Thanks!


1 Answer


Given the limits of Azure SQL DB, I suggest you need to use a copy activity here as well as the stored procedure. You need to handle this within the confines of how ADF wants to work. Remember, this isn't SSIS :-)

If I were building the data factory, these are the steps I'd take...

  1. For completeness, define datasets for each of the tables used by the stored procedure.
  2. First pipeline. Have an activity that calls a stored procedure which does the joins of the input datasets and writes the result to a new staging table (do a SQL INSERT INTO ... SELECT... here) on the first Azure SQL DB instance. There is a sketch of this proc after the list.
  3. Have the output dataset in ADF for the staging table defined (the proc result).
  4. Second pipeline. Have a copy activity with the staging table from point 3 as the input, outputting to the table on the second Azure SQL DB instance. A sketch of this pipeline follows further down.
  5. Again for completeness, an ADF dataset for the final destination table.
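
As a rough sketch of the proc in step 2, assuming a staging table called DataWarehouse.Item_Staging with matching columns (the table and column names here are assumptions, not from the question):

ALTER PROCEDURE [DataWarehouse].[Item_init]
AS
BEGIN
    SET NOCOUNT ON

    -- Clear out the previous run (Item_Staging is an assumed table name)
    TRUNCATE TABLE DataWarehouse.Item_Staging

    -- Persist the joined result instead of returning it as a result set
    INSERT INTO DataWarehouse.Item_Staging (Id, ItemName, Code, FamName, SubfamName, GroupName)
    SELECT a.Id, a.Name, a.Code, f.Name, s.Name, g.Name
    FROM Item.Item a
        JOIN Item.[Group] g ON g.idGroup = a.idGroup
        JOIN Item.Subfam s ON s.idSubfam = g.idSubfam
        JOIN Item.Fam f ON f.idFam = s.idFam
END

The staging table dataset in point 3 would then mirror your ds_SProcItem_init, just with tableName pointing at the staging table.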

The copy activity bridges the gap where cross-database queries aren't possible and SQL Server linked servers don't exist.
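
To make step 4 concrete, the copy pipeline could look roughly like this. This is a sketch only: ds_ItemStaging is an assumed dataset pointing at the staging table, and ds_DWItemOutput is the destination dataset from your question.

{
    "name": "CopyItemStagingPipeline",
    "properties": {
        "activities": [
            {
                "type": "Copy",
                "typeProperties": {
                    "source": {
                        "type": "SqlSource",
                        "sqlReaderQuery": "SELECT * FROM DataWarehouse.Item_Staging"
                    },
                    "sink": {
                        "type": "SqlSink"
                    }
                },
                "inputs": [
                    {
                        "name": "ds_ItemStaging"
                    }
                ],
                "outputs": [
                    {
                        "name": "ds_DWItemOutput"
                    }
                ],
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "CopyItemStaging"
            }
        ],
        "start": "2016-08-02T00:00:00Z",
        "end": "2016-08-02T05:00:00Z",
        "pipelineMode": "Scheduled"
    }
}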

Picture to help...

(Please forgive the poor paint skills)

[diagram: two pipelines, the stored procedure writing into a staging table on the first DB, then a copy activity moving it to the second DB]

Make sense? :-) Good, crack on.