0
votes

I'm attempting to deploy an Azure Data Factory with a copy data pipeline that pulls files from one or more deployed / on-prem file system paths and writes them to blob storage. The source file paths may span multiple drives (e.g. C:\fileshare1 vs. D:\fileshare2) and may include network locations referenced via UNC paths (e.g. \\localnetworkresource\fileshare3).

I'd like to configure a single local file system connection and source data set and just parameterize the linked service's host property. Then my pipeline would just iterate over a collection of file share paths and reuse the dataset and linked service connection. However, it doesn't look like there's any way to have the data set or pipeline provide the host information to the linked service. It's certainly possible to provide folder information from the pipeline and dataset, but that will be concatenated to the host specified in the linked service connection and therefore won't allow me access to different drives or network resources.

It was reasonably straightforward to do this by configuring separate linked service connections, data sets and pipelines for each distinct file share that needed to be included, but I'd prefer to manage a single pipeline.

I already tried to write the JSON for the linked service myself, but it didn't work. Can anyone help?

https://docs.microsoft.com/en-us/azure/data-factory/parameterize-linked-services


2 Answers

3
votes

Yes, you can parameterize a file system linked service as follows. First create a File System linked service, then edit its JSON to add a parameters section, as below:

{
    "name": "OnPremFileSystemLinkedService_Parameterized",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "type": "FileServer",
        "parameters": {
            "HostParameter": {
                "type": "string",
                "defaultValue": "C:\\[Folder]"
            },
            "userIDParameter": {
                "type": "string",
                "defaultValue": "DOMAIN\\USERNAME"
            }
        },
        "annotations": [],
        "typeProperties": {
            "host": "@{linkedService().HostParameter}",
            "userId": "@{linkedService().userIDParameter}",
            "encryptedCredential": "XXXXXXXXXXXencryptedKeyXXXXXXXXX"
        },
        "connectVia": {
            "referenceName": "MySelfHostedIR",
            "type": "IntegrationRuntimeReference"
        }
    }
}

In my sample I used a single file share as input and a Copy activity. For your requirement, you can pass your collection of file shares to a ForEach activity, iterate over it, and pass each value down the chain: Copy activity -> source/sink dataset parameters -> linked service parameters.
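To make the middle link of that chain concrete, here is a minimal sketch of a dataset that forwards its own parameters to the linked service parameters. The dataset name OnPremFileSystemBinary and the parameter names FileShareHost and FileShareUser are hypothetical, and the choice of a Binary dataset is an assumption; only the linked service name and its HostParameter / userIDParameter come from the JSON above. Treat the property layout as an approximation and compare it with what the authoring UI generates.

```json
{
    "name": "OnPremFileSystemBinary",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "OnPremFileSystemLinkedService_Parameterized",
            "type": "LinkedServiceReference",
            "parameters": {
                "HostParameter": {
                    "value": "@dataset().FileShareHost",
                    "type": "Expression"
                },
                "userIDParameter": {
                    "value": "@dataset().FileShareUser",
                    "type": "Expression"
                }
            }
        },
        "parameters": {
            "FileShareHost": { "type": "string" },
            "FileShareUser": { "type": "string" }
        },
        "typeProperties": {
            "location": {
                "type": "FileServerLocation"
            }
        }
    }
}
```

Note that the linked service above stores a single encryptedCredential, so this approach presumably only works if the same account and password are valid for every host you pass in.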

Below is a sample of how to use a parameterized file system linked service:

[Image: sample usage of the parameterized file system linked service]
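In JSON form, the ForEach approach could look roughly like the sketch below. It assumes a hypothetical parameterized source dataset named OnPremFileSystemBinary (with parameters FileShareHost and FileShareUser that it forwards to the linked service parameters) and a hypothetical blob sink dataset named BlobSinkBinary; the pipeline, activity, and parameter names are also made up for illustration. The example paths are the ones from the question. In practice you would probably also parameterize the sink folder per share so files from different shares don't collide.

```json
{
    "name": "CopyFileSharesToBlob",
    "properties": {
        "parameters": {
            "FileShares": {
                "type": "array",
                "defaultValue": [
                    "C:\\fileshare1",
                    "D:\\fileshare2",
                    "\\\\localnetworkresource\\fileshare3"
                ]
            }
        },
        "activities": [
            {
                "name": "ForEachFileShare",
                "type": "ForEach",
                "typeProperties": {
                    "items": {
                        "value": "@pipeline().parameters.FileShares",
                        "type": "Expression"
                    },
                    "activities": [
                        {
                            "name": "CopyShareToBlob",
                            "type": "Copy",
                            "inputs": [
                                {
                                    "referenceName": "OnPremFileSystemBinary",
                                    "type": "DatasetReference",
                                    "parameters": {
                                        "FileShareHost": {
                                            "value": "@item()",
                                            "type": "Expression"
                                        },
                                        "FileShareUser": "DOMAIN\\USERNAME"
                                    }
                                }
                            ],
                            "outputs": [
                                {
                                    "referenceName": "BlobSinkBinary",
                                    "type": "DatasetReference"
                                }
                            ],
                            "typeProperties": {
                                "source": {
                                    "type": "BinarySource",
                                    "storeSettings": {
                                        "type": "FileServerReadSettings",
                                        "recursive": true
                                    }
                                },
                                "sink": {
                                    "type": "BinarySink",
                                    "storeSettings": {
                                        "type": "AzureBlobStorageWriteSettings"
                                    }
                                }
                            }
                        }
                    ]
                }
            }
        ]
    }
}
```

Each iteration passes the current item as the host, so a single pipeline, dataset, and linked service cover all the shares.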

Hope this helps.

1
votes

This is how I solved it :)

The configuration was as follows: