Dec 2018 update:
If you are thinking of doing this with an Azure Function, Azure Data Factory now provides an Azure Function step. The underlying principle is the same, since you still have to expose the Azure Function through an HTTP trigger. However, this approach offers better security, because you can grant your Data Factory instance access to the Azure Function using an ACL.
Reference : https://azure.microsoft.com/en-us/blog/azure-functions-now-supported-as-a-step-in-azure-data-factory-pipelines/
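For orientation, here is a minimal sketch of what such a step might look like in a Data Factory v2 pipeline definition. The activity name, the linked service name, and the angle-bracket value are hypothetical placeholders, and the property names reflect the Azure Function activity schema as I understand it, so check them against the current documentation before relying on them.

```json
{
    "name": "TriggerAzureFunction",
    "type": "AzureFunctionActivity",
    "linkedServiceName": {
        "referenceName": "AzureFunctionLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "functionName": "<your-function-name>",
        "method": "GET"
    }
}
```

The linked service referenced here would hold the function app URL and the function key, so the key does not have to appear in the pipeline itself.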
Original answer
- From the comments posted, I believe you don't want to go the custom activities route.
- You could try using a copy task for this, even though this is probably not its intended purpose.
- There is an httpConnector available for copying data from a web source:
https://docs.microsoft.com/en-us/azure/data-factory/v1/data-factory-http-connector
- The copy task triggers an HTTP endpoint.
- You can specify a variety of authentication mechanisms, from Basic to OAuth2.
- Below I am using the endpoint to trigger the Azure Function process. The output is saved in a Data Lake folder for logging (you can use other sinks, of course; in your case it would be Blob storage).
Basic linked Service
{
    "name": "linkedservice-httpEndpoint",
    "properties": {
        "type": "Http",
        "typeProperties": {
            "url": "https://azurefunction.api.com/",
            "authenticationType": "Anonymous"
        }
    }
}
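If the endpoint is not anonymous, the linked service above can carry credentials instead. The following is a sketch of a Basic authentication variant. The name linkedservice-httpEndpoint-basic and the angle-bracket values are hypothetical placeholders, and userName and password are the property names I recall from the v1 HTTP connector documentation linked earlier, so verify them there.

```json
{
    "name": "linkedservice-httpEndpoint-basic",
    "properties": {
        "type": "Http",
        "typeProperties": {
            "url": "https://azurefunction.api.com/",
            "authenticationType": "Basic",
            "userName": "<user-name>",
            "password": "<password>"
        }
    }
}
```

The input dataset would then reference this linked service by name in place of the anonymous one.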
Basic Input Dataset
{
    "name": "Http-Request",
    "properties": {
        "type": "Http",
        "linkedServiceName": "linkedservice-httpEndpoint",
        "availability": {
            "frequency": "Minute",
            "interval": 30
        },
        "typeProperties": {
            "relativeUrl": "/api/status",
            "requestMethod": "Get",
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ","
            }
        },
        "structure": [
            {
                "name": "Status",
                "type": "String"
            }
        ],
        "published": false,
        "external": true,
        "policy": {}
    }
}
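The dataset above issues a plain GET. If the function expects a POST or is protected by a function key, a variant along these lines may work. This is only a sketch: the name Http-Request-Post and the angle-bracket values are hypothetical placeholders, and additionalHeaders and requestBody are the dataset properties I recall from the v1 HTTP connector documentation, so confirm them there. The x-functions-key header is the one Azure Functions accepts for key-based authorization.

```json
{
    "name": "Http-Request-Post",
    "properties": {
        "type": "Http",
        "linkedServiceName": "linkedservice-httpEndpoint",
        "availability": {
            "frequency": "Minute",
            "interval": 30
        },
        "typeProperties": {
            "relativeUrl": "/api/status",
            "requestMethod": "Post",
            "additionalHeaders": "x-functions-key: <your-function-key>\n",
            "requestBody": "<request-body>"
        },
        "external": true,
        "policy": {}
    }
}
```

Keep in mind that a key written into the dataset definition is stored in plain text there, which is one reason the Azure Function step mentioned in the update is the more secure option.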
Output
{
    "name": "Http-Response",
    "properties": {
        "structure": [
            ...
        ],
        "published": false,
        "type": "AzureDataLakeStore",
        "linkedServiceName": "linkedservice-dataLake",
        "typeProperties": {
            ...
        },
        "availability": {
            ...
        },
        "external": false,
        "policy": {}
    }
}
Activity
{
    "type": "Copy",
    "name": "Trigger Azure Function or WebJob with Http Trigger",
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    },
    "typeProperties": {
        "source": {
            "type": "HttpSource",
            "recursive": false
        },
        "sink": {
            "type": "AzureDataLakeStoreSink",
            "copyBehavior": "MergeFiles",
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
        }
    },
    "inputs": [
        {
            "name": "Http-Request"
        }
    ],
    "outputs": [
        {
            "name": "Http-Response"
        }
    ],
    "policy": {
        ...
    }
}