
I need some clarity on Data Factory and Data Movement. Data Factory (v1 and v2) is only available in a select few regions, but Data Movement is available in many regions. I'd like to understand how Azure Data Factory and Azure Data Movement relate to each other with respect to Azure regions (https://azure.microsoft.com/en-gb/global-infrastructure/services/), and whether they are related or totally different products.

For example, if I have a Data Factory in North Europe, will all data processed by this data factory ALWAYS pass through the North Europe region, even if the source and destination are both in, say, East US? What I am trying to understand is whether Data Factory does anything clever under the hood to use a Data Movement service in the most appropriate region, based on where the data is flowing from and to.

A second abstract example would be loading data from a blob store in Australia to a SQL DB in Australia. I know there are other ways to do this, but say I had to use Data Factory. ADF is not available in Australia, so I would stand it up somewhere like North Europe. Would my data travel from Australia to North Europe and back to Australia? Or would the data movement aspect of ADF be more clever and do that locally?

A further aspect would be the Integration Runtime: does the IR always bring data back to the region the ADF is hosted in for processing?

Thanks.


1 Answer


The Data Movement service is part of the Data Factory service; it is the actual compute environment that performs the data transfer. This means that when you copy data from Australia to Australia, Data Factory will use the Data Movement service in Australia to finish the copy, no matter where the Data Factory itself is (e.g. in East US). The Data Factory region is the region where your data factory metadata is stored.

For a self-hosted IR, the data flow won't go back to Data Factory. The self-hosted IR connects to both the source and the sink data store to transfer the data (see more details at https://docs.microsoft.com/en-us/azure/data-factory/create-self-hosted-integration-runtime#command-flow-and-data-flow).

Here are some more details if you care about the region: for a cloud copy, the Copy Activity auto-detects the region of the sink data store and uses the Data Movement service in that region to perform the copy. When authoring a new pipeline from the CopyWizard UI, you will see the region that will be used. And when the copy finishes, you can also see the execution region on the summary page.
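To make the rule above concrete, here is a toy model in Python. It is not an Azure Data Factory API: `CopyJob` and `data_movement_region` are hypothetical names made up for illustration, and the region strings are just examples. It only encodes the behaviour described above (a cloud copy runs in the sink's region, and the factory's own region plays no part), and it does not try to model what happens when the sink region cannot be detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyJob:
    """Hypothetical description of a cloud copy, for illustration only."""
    factory_region: str  # where the Data Factory metadata is stored
    source_region: str   # region of the source data store
    sink_region: str     # region of the sink data store


def data_movement_region(job: CopyJob) -> str:
    """Toy version of the rule: the copy runs in the sink's region.

    The factory region is deliberately ignored, because it only holds
    the factory metadata and is not where the data is moved.
    """
    return job.sink_region


# Blob store in Australia -> SQL DB in Australia, factory in North Europe.
job = CopyJob(
    factory_region="North Europe",
    source_region="Australia East",
    sink_region="Australia East",
)
print(data_movement_region(job))
```

In this sketch the Australia-to-Australia copy resolves to the Australian region even though the factory sits in North Europe, which matches the second example in the question.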

Regards, Gary