
Does anyone have experience calling datasets dynamically in Azure Data Factory? We dynamically sweep all tables from our IaaS application systems (on-premise SQL Server installations on an Azure VM) into a data lake. We want a single pipeline that accepts server name, database name, user name and password as parameters and passes them to its activities, so that the pipeline sweeps whatever source the parameters point it at. The source systems are currently in a separate subscription and domain within our Enterprise Agreement.

We have looked into using the AutoResolveIntegrationRuntime on a generic SQL Server dataset but, because that runtime is hosted in Azure and the runtimes on the VMs are self-hosted, it can't resolve the servers and we get 'cannot connect' errors. So:

i) Does this problem go away if the source systems are in the same subscription and domain?

That leaves whether anyone can assist with:

ii) A way of getting a dynamic runtime to resolve which SQL Server runtime it should use (we have one per VM for resilience purposes, but they can all see each other's instances). We don't want to parameterise a linked service on a particular VM as it places reliance for other VMs on that single VM.

iii) Ability to parameterise a dataset to call a runtime (doesn't look possible in the UI).

iv) Ability to parameterise the source and sink connections with pipeline activities to call a dataset parameter.


1 Answer


Server, database and table names can all be made dynamic by using parameters. The key problem is that references in ADF can’t be parameterized, such as the linked service reference in a dataset or the integration runtime reference in a linked service. If you don’t have too many self-hosted integration runtimes, maybe you could set up a separate pipeline for each network?
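
As a rough illustration of that split, here is a minimal Python sketch that generates one linked service and one dataset definition per self-hosted runtime. It follows the general shape of ADF's JSON definitions, but treat it as an assumption-laden sketch rather than a tested deployment: the runtime names (`SelfHostedIR-VM1`, `SelfHostedIR-VM2`), the resource names and the helper functions are all hypothetical, and authentication settings are deliberately left out of the connection string. Server, database and table are parameters; the `referenceName` values are literals, which is why one definition per runtime is generated.

```python
import json


def linked_service(name, runtime_name):
    """Parameterized SQL Server linked service bound to ONE fixed runtime."""
    return {
        "name": name,
        "properties": {
            "type": "SqlServer",
            "parameters": {
                "serverName": {"type": "String"},
                "databaseName": {"type": "String"},
            },
            "typeProperties": {
                # Credentials intentionally omitted from this sketch.
                "connectionString": (
                    "Data Source=@{linkedService().serverName};"
                    "Initial Catalog=@{linkedService().databaseName};"
                )
            },
            # Literal reference: per the answer, this cannot be an expression.
            "connectVia": {
                "referenceName": runtime_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


def dataset(name, linked_service_name):
    """Parameterized table dataset that forwards its parameters to the linked service."""
    return {
        "name": name,
        "properties": {
            "type": "SqlServerTable",
            "parameters": {
                "serverName": {"type": "String"},
                "databaseName": {"type": "String"},
                "tableName": {"type": "String"},
            },
            # Literal reference again; only the parameter values are dynamic.
            "linkedServiceName": {
                "referenceName": linked_service_name,
                "type": "LinkedServiceReference",
                "parameters": {
                    "serverName": {"value": "@dataset().serverName", "type": "Expression"},
                    "databaseName": {"value": "@dataset().databaseName", "type": "Expression"},
                },
            },
            "typeProperties": {
                "tableName": {"value": "@dataset().tableName", "type": "Expression"}
            },
        },
    }


# Hypothetical runtime names, one per network or VM.
RUNTIMES = ["SelfHostedIR-VM1", "SelfHostedIR-VM2"]

definitions = []
for ir in RUNTIMES:
    ls_name = f"SqlServer_{ir}"
    definitions.append(linked_service(ls_name, ir))
    definitions.append(dataset(f"SqlTable_{ir}", ls_name))

if __name__ == "__main__":
    print(json.dumps(definitions, indent=2))
```

Each runtime then gets its own pipeline (or its own branch) that points at the matching dataset, while the server, database and table values are still supplied at run time.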