
It may be simple, but I am having a hard time understanding exactly when an Azure Data Factory pipeline is triggered. I followed the MS tutorial to create a DF pipeline that copies data from Blob storage to Azure SQL.

I created the pipeline on 1 March at 16:14 IST (10:44 AM UTC) with the schedule below:

Start date - 02/28/2017 12:00 AM UTC

End date - 03/04/2017 11:59 PM UTC

Recurrence - every 1 day

After the pipeline was created, it immediately ran for the window below:

Window Start - 02/28/2017 12:00 AM UTC

Window End - 03/01/2017 12:00 AM UTC

Attempt Start - 03/01/2017 10:44 AM UTC

Attempt End - 03/01/2017 10:45 AM UTC

Now my question is: why didn't it run for the window 03/01/17 12:00 AM UTC to 03/02/17 12:00 AM UTC, given that the pipeline was created within that window? In other words, it ran for the previous day's window but not for the current day's window.

So at what exact time is a pipeline triggered for each window?


As asked by Paul, here are more configuration values,

Pipeline:

"policy": {
    "timeout": "1.00:00:00",
    "concurrency": 1,
    "executionPriorityOrder": "NewestFirst",
    "style": "StartOfInterval",
    "retry": 3,
    "longRetry": 0,
    "longRetryInterval": "00:00:00"
},
"scheduler": {
    "frequency": "Day",
    "interval": 1
},

"start": "2017-02-28T00:00:00Z",
"end": "2017-03-04T23:59:00Z",

Source Dataset:

"availability": {
    "frequency": "Day",
    "interval": 1
},
"external": true,
"policy": {},

Destination Dataset:

"availability": {
    "frequency": "Day",
    "interval": 1
},
"external": false,
"policy": {},

Below is the execution log,

Start & End Time
03/01/2017 12:00 AM UTC - 03/02/2017 12:00 AM UTC
Attempt Time : 03/02/2017 12:01 AM
It's definitely not simple. It's very complicated. - Nick.McDermaid

1 Answer


Can you please provide the JSON for the pipeline schedule, the dataset definitions (in and out) and the copy activity scheduler?

The attribute values from these 4 different blocks of code are what affect the ADF time slice behaviour. There will be something you've missed in your configuration when you provisioned the slices. Also be mindful that time slices are very different to a SQL Agent schedule, despite the poorly named JSON attribute of 'schedule'! It is simply the start and end of the timeline that is to be sliced up by the defined intervals.

Additionally, there are settings that state what order to run the slices in and when each time slice should execute, e.g. at the start or the end of the interval.
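To make the slicing idea concrete, here is a minimal Python sketch (not ADF code) that cuts the timeline from the question into daily windows. It assumes a slice only becomes eligible to run once its window has ended, which is what the execution times in the question suggest rather than something taken from the documentation. The function names daily_slices and ready_slices are made up for illustration, and the sketch does not model how the service treats the final partial window.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def daily_slices(start, end, interval_days=1):
    """Cut the timeline starting at `start` into consecutive windows.

    Hypothetical helper: a window is created for every start time that
    falls before `end`; the last window is not clipped to `end`.
    """
    step = timedelta(days=interval_days)
    slices = []
    window_start = start
    while window_start < end:
        slices.append((window_start, window_start + step))
        window_start += step
    return slices


def ready_slices(slices, now):
    """Return the windows whose end time has already passed.

    Assumption (inferred from the question's log, not from the docs):
    a slice is eligible to run only after its window has ended.
    """
    return [(ws, we) for ws, we in slices if we <= now]


# Values taken from the question.
start = datetime(2017, 2, 28, 0, 0, tzinfo=UTC)
end = datetime(2017, 3, 4, 23, 59, tzinfo=UTC)
created = datetime(2017, 3, 1, 10, 44, tzinfo=UTC)

slices = daily_slices(start, end)
for ws, we in ready_slices(slices, created):
    print(ws.isoformat(), "->", we.isoformat())
```

Under that assumption, only the 02/28 to 03/01 window is eligible at creation time (10:44 AM UTC on 1 March), and the 03/01 to 03/02 window becomes eligible just after midnight UTC on 2 March, which matches the 12:01 AM attempt in the execution log.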

This is a handy Microsoft article that I often refer to:

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-scheduling-and-execution

Hope this helps.