
We have several projects running on Azure.

Because the projects need to be separated from each other, we need a separate Azure Data Factory (ADF) for each project, as ADF has no ACL of its own. Each project has its own Azure DevOps project and therefore its own Git repository, so each project's ADF ends up connected to its own repository. So we have:

Project1.ADF <-> Project1.DevOpsProject1.GIT

Project2.ADF <-> Project2.DevOpsProject2.GIT

We want to be able to connect to Azure Databricks from each ADF. To keep costs down, we want to avoid creating multiple Databricks workspaces (and Databricks has built-in ACLs that we can use for separation). However, the Databricks workspace can then only be connected to a single Git repository. So if every project is to work in the same Databricks workspace, we need a Databricks repository shared between the projects.

Apart from keeping that repository in Project1.DevOpsProject1.GIT and sharing just that repository with Project2 (or vice versa), is there a better way?


1 Answer


This turned out to be much simpler than I thought: in Databricks you can specify, for each notebook, the Git repository it should be linked to. So we can still have:

Project1.ADF <-> Project1.DevOpsProject1.GIT

Project1.Databricks <-> Project1.DevOpsProject1.GIT

Project2.ADF <-> Project2.DevOpsProject2.GIT

Project2.Databricks <-> Project2.DevOpsProject2.GIT
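To make the resulting layout concrete, here is a minimal sketch in plain Python that models the mapping above and checks that no Git repository ends up shared between projects. It does not call any Azure or Databricks API; the `ProjectLayout` class and the `build_layout` and `shared_repos` helpers are hypothetical names made up for illustration, and the strings simply mirror the naming used in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectLayout:
    """Hypothetical record of what one project links to its own Git repository."""
    name: str
    git_repo: str
    adf: str
    databricks_notebooks: str


def build_layout(project: str, devops_project: str) -> ProjectLayout:
    # Naming follows the "ProjectN.DevOpsProjectN.GIT" convention from the post.
    return ProjectLayout(
        name=project,
        git_repo=f"{project}.{devops_project}.GIT",
        adf=f"{project}.ADF",
        databricks_notebooks=f"{project}.Databricks",
    )


def shared_repos(layouts):
    """Return {repo: [project names]} for repositories used by more than one project."""
    users = {}
    for layout in layouts:
        users.setdefault(layout.git_repo, []).append(layout.name)
    return {repo: names for repo, names in users.items() if len(names) > 1}


layouts = [
    build_layout("Project1", "DevOpsProject1"),
    build_layout("Project2", "DevOpsProject2"),
]

for layout in layouts:
    print(f"{layout.adf} <-> {layout.git_repo}")
    print(f"{layout.databricks_notebooks} <-> {layout.git_repo}")

# An empty dict means every project keeps its notebooks in its own repository.
print(shared_repos(layouts))
```

With per-notebook Git links, `shared_repos` comes back empty: both the ADF and the Databricks notebooks of a project point at that project's repository, so nothing has to be shared across DevOps projects.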
