3 votes

I created some pipelines in my Azure Data Factory service to move data from SQL tables to Azure Tables, but they never start running. Instead, the source datasets remain pending validation even after I click the run button in the Azure portal. I have already checked the external properties, which are all set to true. I wonder if there are any other possible reasons.

Here is my source dataset:

{
    "name": "TableSrc",
    "properties": {
        "published": false,
        "type": "AzureSqlTable",
        "linkedServiceName": "LinkedService-AzureSql",
        "typeProperties": {
            "tableName": "myTable"
        },
        "availability": {
            "frequency": "Month",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}

4 Answers

4 votes

I ran into this while trying to set up a pipeline to run daily and thought I could use the "anchorDateTime" availability property. That does work, but you have to set the "frequency" property of the "availability" section in the dataset to the lowest level of granularity that you want to specify. That is, if you want something to run at 6:30 PM UTC every day, your dataset needs to look like this (because you are specifying a time at the minute level):

"availability": {
    "frequency": "Minute",
    "interval": 1440,
    "anchorDateTime": "2016-01-27T18:30:00Z"
}

and the "scheduler" portion of the pipeline needs to be something like:

"scheduler": {
    "frequency": "Minute",
    "interval": 1440,
    "anchorDateTime": "2016-01-27T18:30:00Z"
}
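
For context, here is a minimal sketch of where that "scheduler" block sits in a pipeline definition, assuming a simple copy activity; the pipeline, activity, and output dataset names are placeholders rather than anything from the original question:

{
    "name": "DailyCopyPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopySqlToAzureTable",
                "type": "Copy",
                "inputs": [ { "name": "TableSrc" } ],
                "outputs": [ { "name": "TableDst" } ],
                "typeProperties": {
                    "source": { "type": "SqlSource" },
                    "sink": { "type": "AzureTableSink" }
                },
                "scheduler": {
                    "frequency": "Minute",
                    "interval": 1440,
                    "anchorDateTime": "2016-01-27T18:30:00Z"
                }
            }
        ],
        "start": "2016-01-27T00:00:00Z",
        "end": "2016-01-30T00:00:00Z"
    }
}

The "scheduler" settings are expected to line up with the output dataset's availability, which is why the dataset change shown above matters as well.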

This will run every 1440 minutes (i.e. every 24 hours). I hope this helps somebody else out, since the Microsoft documentation contradicts itself on this topic (or is at least misleading):

For a daily schedule, if you set anchorDateTime = 10/20/2014 6 AM means that the scheduling will happen every day at 6 AM.

This is actually not true, and two lines later it says:

If the AnchorDateTime has date parts that are more granular than the interval, then the more granular parts will be ignored. For example, if the interval is hourly (frequency: hour and interval: 1) and the AnchorDateTime contains minutes and seconds, then the minutes and seconds parts of the AnchorDateTime will be ignored.

This second part is what I think we're running into and why I suggested the strategy above.

reference: https://msdn.microsoft.com/en-us/library/azure/dn894092.aspx

1 vote

I found the reason: with a monthly availability, the dataset waits for the next month boundary before a slice is produced. That means the pipeline will not start until the first day of the next month, and there is no way to trigger it manually.
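
If a monthly slice is not actually required, one possible workaround (my own sketch, not something this answer prescribes) is to lower the dataset's availability to a finer granularity, for example daily, so a slice becomes due much sooner:

"availability": {
    "frequency": "Day",
    "interval": 1
}

Alternatively, setting the pipeline's active period to a window in the past (as another answer suggests) makes those past slices eligible to run immediately.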

1 vote

I was getting the same problem. It turned out that I had not specified the start time of the pipeline in UTC.
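
For what it's worth, the pipeline's active period is given as "start" and "end" timestamps in ISO 8601 format directly under the pipeline's "properties" section; using the Z suffix makes the UTC intent explicit. A minimal example (dates are illustrative, not from my actual pipeline):

"start": "2016-01-27T00:00:00Z",
"end": "2016-01-30T00:00:00Z"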

1 vote

If you want your pipeline to run right away, set its active period to dates in the past. You can do that with the PowerShell command below:

Set-AzureDataFactoryPipelineActivePeriod -DataFactoryName $DataFactoryName -PipelineName $PipelineName `
    -StartDateTime $DateInPast -EndDateTime $DateOneDayLessInPast -ResourceGroupName $ResourceGroupName -Force