
I have a Stream Analytics job which constantly dumps data into Cosmos DB. Each payload has a property "Type", an integer value of either 1 or 2, which determines the shape of the payload itself, i.e. which columns it includes.

I'm using Azure Data Factory V2 to copy data from Cosmos DB to Data Lake. I've created a pipeline with a copy activity that does this job. I'm setting the output folder path using:

@concat('datafactoryingress/rawdata/',dataset().productFilter,'/',formatDateTime(utcnow(),'yyyy'),'/')
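
For context, here is a minimal sketch (with made-up dataset and linked service names) of how that expression could sit inside a parameterized Azure Data Lake Store output dataset; only the productFilter parameter and the folderPath expression come from my setup above:

    {
        "name": "OutputDataset",
        "properties": {
            "type": "AzureDataLakeStoreFile",
            "linkedServiceName": {
                "referenceName": "DataLakeLinkedService",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "productFilter": { "type": "String" }
            },
            "typeProperties": {
                "format": { "type": "JsonFormat" },
                "folderPath": {
                    "value": "@concat('datafactoryingress/rawdata/', dataset().productFilter, '/', formatDateTime(utcnow(), 'yyyy'), '/')",
                    "type": "Expression"
                }
            }
        }
    }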

What I want Data Factory to do is inspect the payload itself, i.e. determine whether the Type is 1 or 2, and based on that decide whether the data goes in folder 1 or folder 2. In other words, I want to iterate over the data from Cosmos DB, determine the message Type, segregate the documents by Type, and set the folder paths dynamically.

Is there a way to do that? Can I inspect the Cosmos DB document to find the message type, and if so, how do I set the folder path dynamically based on it?

Ugh, you really should include a diagram and/or script to explain the case in detail. It feels like your case would become much simpler if you just implemented two separate flows, one for folder1 and one for folder2, and put the condition in the source dataset. - Imre Pühvel

1 Answer


Is there a way to do that? Can I inspect the Cosmos DB document to find the message type, and if so, how do I set the folder path dynamically based on it?

Unfortunately, based on the documentation, ADF does not currently support dynamic content derived from the source data: you can't use fields from the source rows as dynamic parameters for the sink output path. Given your situation, I suggest setting up two separate pipelines, one per Type value, each transferring the corresponding data.
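
For example, each pipeline could use a copy activity whose Cosmos DB source query filters on the Type field, with the sink dataset pointing at the matching folder. Here is a minimal sketch for the Type = 1 pipeline (the dataset names are hypothetical; the Type = 2 pipeline would be identical except for the query and the output folder):

    {
        "name": "CopyType1",
        "type": "Copy",
        "inputs": [
            { "referenceName": "CosmosDbDataset", "type": "DatasetReference" }
        ],
        "outputs": [
            { "referenceName": "DataLakeType1Dataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
            "source": {
                "type": "DocumentDbCollectionSource",
                "query": "SELECT * FROM c WHERE c.Type = 1"
            },
            "sink": {
                "type": "AzureDataLakeStoreSink"
            }
        }
    }

The DataLakeType1Dataset would then hard-code the type segment of the path, e.g. a folderPath starting with datafactoryingress/rawdata/1/.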

If the Type field takes many different values and you still want to differentiate the output path per value, ADF may not be a suitable choice for you. In that case you could write your own code to fulfill your needs.
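
As a rough illustration of that custom-code route, here is a minimal Python sketch using the azure-cosmos SDK; the endpoint, key, database, container, and local output layout are all placeholders, and uploading the resulting folders to Data Lake would be a separate step:

    import json
    import os

    from azure.cosmos import CosmosClient

    # Placeholder connection details - replace with your own values.
    ENDPOINT = "https://<your-account>.documents.azure.com:443/"
    KEY = "<your-key>"

    client = CosmosClient(ENDPOINT, credential=KEY)
    container = (client
                 .get_database_client("<database>")
                 .get_container_client("<container>"))

    # Read every document and bucket it by its Type property (1 or 2).
    for doc in container.query_items(query="SELECT * FROM c",
                                     enable_cross_partition_query=True):
        folder = os.path.join("rawdata", str(doc["Type"]))  # e.g. rawdata/1
        os.makedirs(folder, exist_ok=True)
        # Append the document as one JSON line in its Type folder.
        with open(os.path.join(folder, "documents.jsonl"), "a") as f:
            f.write(json.dumps(doc) + "\n")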