
I am trying to build a so-called "modern data warehouse" using Azure services.

The first step is to gather all the data in its native raw format into Azure Data Lake Store. For some of the data sources, we have no choice but to consume the data through an API. There is not much information on this when searching, so I am asking here.

Is it possible to define two Web activities in my pipeline to handle the scenario below?

  1. The Web1 activity calls an API URL generated from C# (an Azure Function). The call returns data in JSON format, which is saved to Web1.Output - this is working fine.
  2. The Web2 activity consumes Web1.Output and saves it into Azure Data Lake as a plain txt file (PUT or POST) - this is what I need.

The scenario above is achievable with a Copy activity, but then I am not able to pass the dynamic URL generated by the Azure Function. How do I save the JSON output to ADL? Is there any other way?
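For context, here is roughly what I have in mind in pipeline JSON. This is only a sketch, not a tested definition: the function URL, the Data Lake account and path, and the use of the ADLS Gen1 WebHDFS-style REST endpoint with MSI authentication are assumptions on my part.

```json
{
    "name": "IngestApiToAdlPipeline",
    "properties": {
        "activities": [
            {
                "name": "Web1",
                "type": "WebActivity",
                "typeProperties": {
                    "url": "https://myfunctionapp.azurewebsites.net/api/GetData",
                    "method": "GET"
                }
            },
            {
                "name": "Web2",
                "type": "WebActivity",
                "dependsOn": [
                    {
                        "activity": "Web1",
                        "dependencyConditions": [ "Succeeded" ]
                    }
                ],
                "typeProperties": {
                    "url": "https://myadls.azuredatalakestore.net/webhdfs/v1/raw/output.txt?op=CREATE&write=true&overwrite=true",
                    "method": "PUT",
                    "body": "@activity('Web1').output",
                    "authentication": {
                        "type": "MSI",
                        "resource": "https://datalake.azure.net/"
                    }
                }
            }
        ]
    }
}
```

Since the url field also accepts expressions, I assume a value returned by the function (for example `@activity('Web1').output.url`, if the function returns a `url` field) could be fed into the downstream call the same way.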

Thanks!


1 Answer


Since you are using blob storage as an intermediary and want to consume the blob upon creation, you could take advantage of Event Triggers. You can set up an event trigger to run a pipeline containing the Web2 activity, which kicks off when the Web1 activity (in a separate pipeline) completes.

Separating the two activities into different pipelines makes the workflow asynchronous: you will not need to wait for both activities to complete before processing the next URL. There are other benefits as well.
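As a rough illustration (the trigger name, paths, pipeline name, and storage account scope below are placeholders, not a tested definition), the trigger could look something like this:

```json
{
    "name": "OnBlobCreatedTrigger",
    "properties": {
        "type": "BlobEventsTrigger",
        "typeProperties": {
            "blobPathBeginsWith": "/raw/blobs/",
            "blobPathEndsWith": ".json",
            "events": [ "Microsoft.Storage.BlobCreated" ],
            "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "Web2Pipeline",
                    "type": "PipelineReference"
                },
                "parameters": {
                    "fileName": "@triggerBody().fileName"
                }
            }
        ]
    }
}
```

The `@triggerBody().fileName` expression shows how the trigger can hand the created blob's name to the pipeline as a parameter, so the Web2 activity knows which file to pick up.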