I want to call a REST API and save the results as a CSV or JSON file in Azure Data Lake Gen2. Based on what I have read, Azure Functions is the way to go.
The web service returns data in the following format:
"ID","ProductName","Company"
"1","Apples","Alfreds futterkiste"
"2","Oranges","Alfreds futterkiste"
"3","Bananas","Alfreds futterkiste"
"4","Salad","Alfreds futterkiste"
...next rows
I have written a console app in C# which at the moment writes the data to the console. The web service uses pagination and returns up to 1000 rows per request (set by the &number parameter, which has a maximum of 1000). After the first request I can use the &next parameter to fetch the next 1000 rows based on ID. For instance, the URL
http://testWebservice123.com/Example.csv?auth=abc&number=1000&next=1000
will get me the rows with IDs 1001 to 2000. (In reality the API call and the pagination are a bit more complex, so I cannot use, for instance, Azure Data Factory v2 to load the data into Azure Data Lake. This is why I think I need Azure Functions, unless I have overlooked another service? So the following is just a demo to learn how to write to Azure Data Lake.)
I have the following C#:
using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string startUrl = "http://testWebservice123.com/Example.csv?auth=abc&number=1000";
        string deltaRequestParameter = "";
        int numberOfLines;
        do
        {
            string url = startUrl + deltaRequestParameter;
            numberOfLines = 0;
            // WebClient is IDisposable, so dispose it together with the stream and the reader.
            using (WebClient myWebClient = new WebClient())
            using (Stream myStream = myWebClient.OpenRead(url))
            using (StreamReader sr = new StreamReader(myStream))
            {
                while (!sr.EndOfStream)
                {
                    var row = sr.ReadLine();
                    var values = row.Split(',');
                    // Do whatever with the rows for now, i.e. write to the console.
                    Console.WriteLine(values[0] + " " + values[1]);
                    // Remember the ID of the current row; after the loop it is the last ID on the page.
                    string lastLine = values[0].Replace("\"", "");
                    numberOfLines++;
                    deltaRequestParameter = "&next=" + lastLine;
                }
            }
        } while (numberOfLines == 1001); // The header is returned with every page, so a full page has 1001 lines; a shorter page is the last one.
    }
}
I want to write the data to a CSV file in the data lake in the most efficient way. How would I rewrite the above code to run in an Azure Function and save the result as a CSV file in Azure Data Lake Gen2?
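To make the target concrete, here is a minimal sketch of the shape such a function might take. It is an illustration under stated assumptions, not a tested solution: it assumes the in-process Azure Functions model with a timer trigger and the Azure.Storage.Files.DataLake NuGet package (version 12.5 or later for OpenWriteAsync). The app setting name "DataLakeConnection", the file system name "raw", the file path, and the schedule are all hypothetical placeholders, and the file system is assumed to exist already. It streams each page straight into one file and writes the header only once.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Azure.Storage.Files.DataLake;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class LoadCsvToDataLake
{
    // Reuse a single HttpClient across invocations.
    private static readonly HttpClient Http = new HttpClient();
    private const int PageSize = 1000;

    [FunctionName("LoadCsvToDataLake")]
    public static async Task Run(
        [TimerTrigger("0 0 2 * * *")] TimerInfo timer, // hypothetical schedule: daily at 02:00
        ILogger log)
    {
        string startUrl = "http://testWebservice123.com/Example.csv?auth=abc&number=" + PageSize;

        // Assumption: "DataLakeConnection" is an app setting holding the storage
        // account connection string, and the file system "raw" already exists.
        var service = new DataLakeServiceClient(
            Environment.GetEnvironmentVariable("DataLakeConnection"));
        DataLakeFileSystemClient fileSystem = service.GetFileSystemClient("raw");
        DataLakeFileClient file = fileSystem.GetFileClient(
            $"example/{DateTime.UtcNow:yyyyMMdd}/Example.csv"); // hypothetical path

        string lastId = null;
        bool firstPage = true;
        int dataRows;

        // OpenWriteAsync returns a stream that uploads to the file as it is written.
        using (Stream target = await file.OpenWriteAsync(overwrite: true))
        using (var writer = new StreamWriter(target))
        {
            do
            {
                string url = lastId == null ? startUrl : startUrl + "&next=" + lastId;
                dataRows = 0;

                using (Stream page = await Http.GetStreamAsync(url))
                using (var reader = new StreamReader(page))
                {
                    // Every page starts with the header; keep it only for the first page.
                    string header = await reader.ReadLineAsync();
                    if (firstPage && header != null)
                    {
                        await writer.WriteLineAsync(header);
                    }
                    firstPage = false;

                    string row;
                    while ((row = await reader.ReadLineAsync()) != null)
                    {
                        if (row.Length == 0) continue; // skip blank lines
                        await writer.WriteLineAsync(row);
                        // The ID is the first field; it drives the &next parameter.
                        lastId = row.Split(',')[0].Trim('"');
                        dataRows++;
                    }
                }

                log.LogInformation("Fetched {Rows} rows, last ID {LastId}", dataRows, lastId);
            } while (dataRows == PageSize); // a short page means it was the last one

            await writer.FlushAsync();
        }
    }
}
```

Counting only data rows lets the loop stop on the first page with fewer than 1000 rows, which corresponds to the 1001-line check in the console version. A timer trigger is only one option; an HTTP trigger would work the same way if the load has to be started on demand.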