
I'm trying to copy some sample data from one SQL Server DB to another. For some reason the pipeline keeps waiting for source data. When I look at the source dataset, no slices have been created.

The following are my JSONs:

Destination table:

{
  "name": "DestTable1",
  "properties": {
    "structure": [
      {
        "name": "C1",
        "type": "Int16"
      },
      {
        "name": "C2",
        "type": "Int16"
      },
      {
        "name": "C3",
        "type": "String"
      },
      {
        "name": "C4",
        "type": "String"
      }
    ],
    "published": false,
    "type": "SqlServerTable",
    "linkedServiceName": "SqlServer2",
    "typeProperties": {
      "tableName": "OferTarget1"
    },
    "availability": {
      "frequency": "Hour",
      "interval": 1
    }
  }
}

Source Table:

{
  "name": "SourceTable1",
  "properties": {
    "structure": [
      {
        "name": "C1",
        "type": "Int16"
      },
      {
        "name": "C2",
        "type": "Int16"
      },
      {
        "name": "C3",
        "type": "String"
      },
      {
        "name": "C4",
        "type": "String"
      }
    ],
    "published": false,
    "type": "SqlServerTable",
    "linkedServiceName": "SqlServer",
    "typeProperties": {
      "tableName": "OferSource1"
    },
    "availability": {
      "frequency": "Hour",
      "interval": 1
    },
    "external": true,
    "policy": { }
  }
}

Pipeline:

{
  "name": "CopyTablePipeline",
  "properties": {
    "description": "Copy data from source table to target table",
    "activities": [
      {
        "type": "Copy",
        "typeProperties": {
          "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "select c1,c2,c3,c4 from OferSource1"
          },
          "sink": {
            "type": "SqlSink",
            "writeBatchSize": 1000,
            "writeBatchTimeout": "60.00:00:00"
          }
        },
        "inputs": [
          {
            "name": "SourceTable1"
          }
        ],
        "outputs": [
          {
            "name": "DestTable1"
          }
        ],
        "policy": {
          "timeout": "01:00:00",
          "concurrency": 1
        },
        "scheduler": {
          "frequency": "Hour",
          "interval": 1
        },
        "name": "CopySqlToSql",
        "description": "Demo Copy"
      }
    ],
    "start": "2017-10-22T09:55:00Z",
    "end": "2017-10-22T13:55:00Z",
    "isPaused": true,
    "hubName": "wer-dev-datafactoryv1_hub",
    "pipelineMode": "Scheduled"
  }
}

I can see the process in the monitor view, but the pipeline is stuck and waiting for the source data to arrive.

What am I doing wrong?

In your pipeline configuration I can see "isPaused": true - are you sure your pipeline is not actually paused? - arghtype

1 Answer


Scheduling can be a bit tricky initially. There are a few reasons why a time slice might be stuck waiting:

Activity Level

  1. Source Properties

Setting "external": true and specifying the externalData policy informs the Azure Data Factory service that the table is external to the data factory and is not produced by an activity in the data factory (see the snippet after this list).

  2. Concurrency (not likely in your case): An activity can also be held up if multiple slices of the activity are valid in the specified time window. For example, if your start/end dates are 01-01-2014 to 01-01-2015 for a monthly activity and concurrency is set to 4, four months will run in parallel while the rest are stuck with the message "Waiting on Concurrency".
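
For reference, an external input dataset in ADF v1 is typically declared along these lines; this is only a rough sketch, and the retry values are illustrative, not something taken from your setup:

"external": true,
"policy": {
  "externalData": {
    "retryInterval": "00:01:00",
    "retryTimeout": "00:10:00",
    "maximumRetry": 3
  }
}

An empty "policy": { } like yours should also be accepted; the important part is that "external": true is set on every input dataset that no other activity produces, otherwise its slices never become Ready.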

Pipeline Level

  1. Ensure that DateTime.Now lies between the start and end, accounting for any delay. More on how the scheduling of activities works is explained in this article: https://blogs.msdn.microsoft.com/ukdataplatform/2016/05/03/demystifying-activity-scheduling-with-azure-data-factory/

  2. Paused: a pipeline can be paused, in which case the time slice will appear in the monitor with the message "Waiting for the pipeline to Resume". You can either edit the pipeline JSON and set "isPaused": false, or resume the pipeline by right-clicking it and hitting Resume (see the sketch after this list).
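
As a sketch of how the pipeline-level section of your JSON could look with both points above addressed (the end value here is just an example; pick any window that covers the slices you want produced):

"start": "2017-10-22T09:55:00Z",
"end": "2017-10-23T09:55:00Z",
"isPaused": false

Once the pipeline is resumed and the active period contains the slices you expect, the hourly time slices should start moving out of the waiting state.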

A good way to check when your next iteration is scheduled is to use the Monitor option.
