
We are uploading files into Azure Data Lake Storage using the Azure SDK for Java. After a file is uploaded, an Azure Data Factory pipeline needs to be triggered, so a Blob Created event trigger is attached to the pipeline. The problem is that the pipeline is triggered twice after each file upload.

To upload a file into ADLS Gen2, Azure provides a different SDK than the one for conventional Blob storage.

The SDK uses the package azure-storage-file-datalake.

DataLakeFileSystemClient - to get the container

DataLakeDirectoryClient.createFile - to create a file // this call may raise a Blob Created event

DataLakeFileClient.uploadFromFile - to upload the file // this call may also raise a Blob Created event (see the sketch after this list)
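
For reference, the upload path looks roughly like the sketch below (a minimal sketch; the account name, key, container, directory, and file names are placeholders). Note that createFile and uploadFromFile are two separate service calls, which is consistent with two events being raised:

```java
import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.storage.file.datalake.DataLakeDirectoryClient;
import com.azure.storage.file.datalake.DataLakeFileClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;

public class AdlsUpload {
    public static void main(String[] args) {
        // Placeholder credentials -- replace with your account details.
        StorageSharedKeyCredential credential =
                new StorageSharedKeyCredential("<account-name>", "<account-key>");

        DataLakeServiceClient serviceClient = new DataLakeServiceClientBuilder()
                .endpoint("https://<account-name>.dfs.core.windows.net")
                .credential(credential)
                .buildClient();

        // DataLakeFileSystemClient - to get the container
        DataLakeFileSystemClient fileSystem = serviceClient.getFileSystemClient("my-container");
        DataLakeDirectoryClient directory = fileSystem.getDirectoryClient("my-directory");

        // DataLakeDirectoryClient.createFile - first service call (may raise Blob Created)
        DataLakeFileClient file = directory.createFile("report.csv");

        // DataLakeFileClient.uploadFromFile - second service call (may also raise Blob Created)
        file.uploadFromFile("/local/path/report.csv", true);
    }
}
```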

I suspect the ADF trigger has not been updated to capture Blob Created events from ADLS Gen2 appropriately.

Is there any option to achieve this? My organization restricts the use of Azure Functions; otherwise, an Azure Function could be triggered by a Storage Queue or Service Bus message and start the ADF pipeline using the Data Factory REST API.
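
For completeness, starting a pipeline run from our own code via the Data Factory REST API (the createRun operation, api-version 2018-06-01) would look roughly like this. This is only a sketch: the subscription, resource group, factory, and pipeline names are placeholders, and acquiring the Azure AD bearer token is assumed to happen elsewhere.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AdfPipelineRunner {
    // Placeholder identifiers -- replace with your own values.
    private static final String URL_TEMPLATE =
            "https://management.azure.com/subscriptions/%s/resourceGroups/%s"
            + "/providers/Microsoft.DataFactory/factories/%s/pipelines/%s"
            + "/createRun?api-version=2018-06-01";

    public static void createRun(String bearerToken) throws Exception {
        String url = String.format(URL_TEMPLATE,
                "<subscription-id>", "<resource-group>", "<factory-name>", "<pipeline-name>");

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Authorization", "Bearer " + bearerToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{}")) // optional pipeline parameters
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // A successful call returns HTTP 200 with a JSON body containing the runId.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```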


1 Answer


You could try Azure Logic Apps with a blob trigger and a Data Factory action.

Trigger: When a blob is added or modified (properties only):

  • This operation triggers a flow when one or more blobs are added or modified in a container. The trigger fetches only the file metadata; to get the file content, use the "Get file content" operation. The trigger does not fire if a file is added or updated in a subfolder. If triggering on subfolders is required, multiple triggers should be created.

Action: Get a pipeline run

  • Get a particular pipeline run execution

Hope this helps.