
I have a Cosmos DB collection in the following format:

{
    "deviceid": "xxx",
    "partitionKey": "key1",
    .....
    "_ts": 1544583745
}

I'm using Azure Data Factory to copy data from Cosmos DB to ADLS Gen 2. Copying with a plain Copy activity is straightforward; however, my main concern is the output path in ADLS Gen 2. Our requirement is that the output path follow a specific format. Here is a sample:

outerfolder/version/code/deviceid/year/month/day

Since deviceid, year, month, and day all live in the payload itself, the only way I've found to use them is to create a Lookup activity and reference its output in the Copy activity.

(Screenshot: pipeline with the Lookup activity feeding the Copy activity)
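For reference, here is a sketch of what that Lookup activity looks like in pipeline JSON. Names like LookupDevice and CosmosDataset are placeholders; the query derives year/month/day from _ts with TimestampToDateTime, which expects milliseconds, hence the * 1000:

{
    "name": "LookupDevice",
    "type": "Lookup",
    "typeProperties": {
        "source": {
            "type": "CosmosDbSqlApiSource",
            "query": "SELECT TOP 1 c.deviceid, DateTimePart(\"yyyy\", TimestampToDateTime(c._ts * 1000)) AS year, DateTimePart(\"mm\", TimestampToDateTime(c._ts * 1000)) AS month, DateTimePart(\"dd\", TimestampToDateTime(c._ts * 1000)) AS day FROM c"
        },
        "dataset": {
            "referenceName": "CosmosDataset",
            "type": "DatasetReference"
        },
        "firstRowOnly": true
    }
}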

And this is how I set the output folder using the dataset property:

(Screenshot: dataset property with the dynamic output folder expression)
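In case it helps, this is roughly how I parameterized the dataset so the folder path can be built dynamically. This is only a sketch: I'm assuming outerfolder is the ADLS Gen 2 file system and hard-coding version and code for brevity, and the dataset/linked service names are placeholders:

{
    "name": "AdlsOutputDataset",
    "properties": {
        "type": "Json",
        "linkedServiceName": {
            "referenceName": "AdlsGen2LinkedService",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "deviceid": { "type": "String" },
            "year": { "type": "String" },
            "month": { "type": "String" },
            "day": { "type": "String" }
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "outerfolder",
                "folderPath": {
                    "value": "@concat('version/code/', dataset().deviceid, '/', dataset().year, '/', dataset().month, '/', dataset().day)",
                    "type": "Expression"
                }
            }
        }
    }
}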

I'm using SQL API on Cosmos DB to query the data.

Is there a better way to achieve this?


1 Answer


I think your way works, but it's not the cleanest. What I'd do is create a separate pipeline variable for each value: version, code, deviceid, and so on. Then, after the Lookup, assign each variable with a Set Variable activity, and finally run the Copy activity referencing those pipeline variables. See the sketch below.

(Screenshot: proposed pipeline)
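A rough sketch of the relevant pipeline JSON, showing one Set Variable activity and the Copy activity referencing the variables. Only deviceid and year are shown; the other variables follow the same pattern, and all names here are placeholders:

{
    "variables": {
        "deviceid": { "type": "String" },
        "year": { "type": "String" }
    },
    "activities": [
        {
            "name": "SetDeviceId",
            "type": "SetVariable",
            "dependsOn": [
                { "activity": "LookupDevice", "dependencyConditions": [ "Succeeded" ] }
            ],
            "typeProperties": {
                "variableName": "deviceid",
                "value": "@string(activity('LookupDevice').output.firstRow.deviceid)"
            }
        },
        {
            "name": "CopyToAdls",
            "type": "Copy",
            "dependsOn": [
                { "activity": "SetDeviceId", "dependencyConditions": [ "Succeeded" ] }
            ],
            "typeProperties": {
                "source": { "type": "CosmosDbSqlApiSource" },
                "sink": {
                    "type": "JsonSink",
                    "storeSettings": { "type": "AzureBlobFSWriteSettings" }
                }
            },
            "inputs": [
                { "referenceName": "CosmosDataset", "type": "DatasetReference" }
            ],
            "outputs": [
                {
                    "referenceName": "AdlsOutputDataset",
                    "type": "DatasetReference",
                    "parameters": {
                        "deviceid": "@variables('deviceid')",
                        "year": "@variables('year')"
                    }
                }
            ]
        }
    ]
}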

It may look somewhat redundant, but think of someone (or you, two years from now) having to modify the pipeline when you are not around or have forgotten the details: this way it is clear how the pipeline works and exactly what needs to be modified.

Hope this helps!