I have a Data Factory pipeline that uses the copy activity to move data from a source data warehouse -> staging blob storage -> sink data warehouse.

The copy from source to blob works as expected (rows are copied). The copy from staging to sink fails: 0 rows are copied.

Disabling PolyBase and using bulk insert works.

{
    "name": "PI_TEST",
    "properties": {
        "activities": [
            {
                "name": "MaterializedEventIdFilter_Copy",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [
                    {
                        "name": "Destination",
                        "value": "[formigration].[MaterializedEventIdFilter]"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "SqlDWSource",
                        "sqlReaderStoredProcedureName": "[formigration].[proc_GetStgMaterializedEventIdFilter]"
                    },
                    "sink": {
                        "type": "SqlDWSink",
                        "allowPolyBase": true,
                        "writeBatchSize": 100000,
                        "polyBaseSettings": {
                            "rejectValue": 0,
                            "rejectType": "value",
                            "useTypeDefault": false
                        }
                    },
                    "enableStaging": true,
                    "stagingSettings": {
                        "linkedServiceName": {
                            "referenceName": "riskstoreprd",
                            "type": "LinkedServiceReference"
                        },
                        "enableCompression": true
                    }
                },
                "inputs": [
                    {
                        "referenceName": "ioPrePrdMaterializedEventIdFilter",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "CloudPrdMaterializedEventIdFilter",
                        "type": "DatasetReference"
                    }
                ]
            },
            {
                "name": "MaterialisedEvent",
                "type": "SqlServerStoredProcedure",
                "dependsOn": [
                    {
                        "activity": "MaterializedEventIdFilter_Copy",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 2,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "storedProcedureName": "[formigration].[proc_SetStgMaterializedEventIdFilter]"
                },
                "linkedServiceName": {
                    "referenceName": "cloud_prd",
                    "type": "LinkedServiceReference"
                }
            }
        ],
        "annotations": []
    },
    "type": "Microsoft.DataFactory/factories/pipelines"
}
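
For comparison, the bulk-insert variant that does copy rows comes down to the sink settings. A minimal sketch, assuming only `allowPolyBase` is flipped and `polyBaseSettings` is dropped, with everything else in the pipeline above left unchanged:

```json
{
    "sink": {
        "type": "SqlDWSink",
        "allowPolyBase": false,
        "writeBatchSize": 100000
    }
}
```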

I expected the data from the blob to make it into the sink but no rows are copied.

Edit 1: I checked the data warehouse (sink). A connection is made, and I can see the external tables etc. being created from the blob storage, all within a second, yet no data is copied in.

INSERT INTO [formigration].[MaterializedEventIdFilter] SELECT * FROM [ADFCopyGeneratedExternalTable_307e2c7f-a56f-4b75-86fb-10ab0cb94548]

1 Answer

In PolyBase, external tables are just references to a blob storage folder or file; they don't hold any rows themselves. If you want to actually copy data into your warehouse, create a regular table and use it as the sink in your copy activity.
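
As a rough illustration of what that could look like, here is a sketch of a sink dataset that points at a regular warehouse table. The dataset name and table name are taken from the question; the linked service `cloud_prd` and the `AzureSqlDWTable` dataset shape are assumptions about how the sink is configured, so adjust them to match your factory:

```json
{
    "name": "CloudPrdMaterializedEventIdFilter",
    "properties": {
        "description": "Hypothetical sketch: sink dataset targeting a regular (non-external) table. Linked service and table name are assumed from the question.",
        "type": "AzureSqlDWTable",
        "linkedServiceName": {
            "referenceName": "cloud_prd",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "tableName": "[formigration].[MaterializedEventIdFilter]"
        }
    }
}
```

The table named in `tableName` would need to exist in the warehouse as an ordinary table before the copy runs.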

Hope this helped!