
I have a copy data pipeline in Azure Data Factory. I need to deploy the same Data Factory instance to multiple environments (DEV, QA, PROD) using a Release Pipeline.

The pipeline transfers data from a customer's Storage Account (Blob Container) to a centralized Data Lake, so it is a many-to-one flow (many customers > one Data Lake).

Now, suppose I am in the DEV environment and have one demo customer there. I have defined an ADF Copy Data pipeline for it. In the PROD environment, however, the number of customers will grow, and I don't want to create multiple copies of the same pipeline in the production Data Factory.

I am looking for a solution that lets me keep a single copy pipeline in the Data Factory and deploy/promote that same factory from one environment to the next, even when the number of customers varies between environments.

I am also doing CI/CD in Azure Data Factory using Git integration with Azure Repos.


1 Answer


You will have to create the additional linked services and datasets that do not exist in the non-production environments, so that every new customer storage account is mapped to the single pipeline instance.
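A common way to keep one pipeline while the customer count grows is to parameterize the linked service so the storage account is supplied at runtime. Here is a minimal sketch of such a linked service, assuming managed-identity authentication; the name `LS_CustomerBlob` and the `storageAccountName` parameter are illustrative, not from your setup:

```json
{
  "name": "LS_CustomerBlob",
  "properties": {
    "type": "AzureBlobStorage",
    "parameters": {
      "storageAccountName": { "type": "String" }
    },
    "typeProperties": {
      "serviceEndpoint": "https://@{linkedService().storageAccountName}.blob.core.windows.net"
    }
  }
}
```

A dataset built on this linked service passes the parameter through, and the Copy activity (or a ForEach over a customer list) supplies the value per run, so onboarding a new customer does not require a new pipeline.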

With your CI/CD routine you can then deliver this incrementally, i.e. parameterize the release pipeline with variable groups and update the Data Factory instance with the new pipelines, datasets, and linked services as customers are added.
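When ADF publishes, it exports an ARM template whose parameters (factory name, linked service endpoints, and so on) can be overridden per release stage from variable-group values. A sketch of a per-environment parameters file follows; the factory name, account name, and the auto-generated parameter name are assumptions for illustration:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "factoryName": {
      "value": "adf-contoso-prod"
    },
    "LS_DataLake_properties_typeProperties_url": {
      "value": "https://contosodatalakeprod.dfs.core.windows.net"
    }
  }
}
```

Each release stage deploys the same template with its own parameter file (or inline overrides fed from that stage's variable group), so DEV, QA, and PROD differ only in parameter values, not in pipeline definitions.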