
I would like to use Azure Data Factory to regularly download about 500,000 JSON files from a web API and store them in a Blob Storage container. I then need to parse each file to extract a few values and store those values, together with an ID that is part of the filename, in a database. I can do this with a ForEach activity that runs a custom activity per file, but that is very slow, so I would prefer some kind of batch activity that runs the same parsing code across all the files. Is there a way to do this?
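For reference, this is roughly the shape of the per-file pipeline I have now (the linked service, script, and activity names are placeholders). The ForEach fans out over a file list from a Get Metadata activity and launches one custom activity per file, which is where the time goes:

```json
{
  "name": "ParseFilesOneByOne",
  "type": "ForEach",
  "typeProperties": {
    "isSequential": false,
    "batchCount": 20,
    "items": {
      "value": "@activity('GetFileList').output.childItems",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "ParseOneFile",
        "type": "Custom",
        "linkedServiceName": {
          "referenceName": "AzureBatchLinkedService",
          "type": "LinkedServiceReference"
        },
        "typeProperties": {
          "command": "python parse_one.py @{item().name}",
          "folderPath": "customactivities/parse"
        }
      }
    ]
  }
}
```

Even with batchCount raised, the per-file startup overhead of the custom activity dominates.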


1 Answer


If your source JSON files share the same schema, you can use the Copy Activity, which can parse all of them in a single run. If possible, I would also suggest splitting the files into subfolders (e.g. 1,000 files per folder), so that each copy run takes less time and the files are easier to manage.
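As a rough sketch (the dataset names, JSON paths, and column names below are placeholders you would replace with your own), a single Copy Activity can pick up every file under a folder with a wildcard and map JSON properties to table columns. The $$FILEPATH reserved variable on an additional source column is one way to bring the file path along so the ID can be derived from it downstream:

```json
{
  "name": "CopyJsonToSql",
  "type": "Copy",
  "inputs": [ { "referenceName": "JsonFilesDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "SqlValuesDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": {
      "type": "JsonSource",
      "storeSettings": {
        "type": "AzureBlobStorageReadSettings",
        "recursive": true,
        "wildcardFolderPath": "incoming/batch-001",
        "wildcardFileName": "*.json"
      },
      "additionalColumns": [
        { "name": "SourceFilePath", "value": "$$FILEPATH" }
      ]
    },
    "sink": { "type": "AzureSqlSink" },
    "translator": {
      "type": "TabularTranslator",
      "mappings": [
        { "source": { "path": "$.measurement.value" }, "sink": { "name": "MeasuredValue" } },
        { "source": { "name": "SourceFilePath" }, "sink": { "name": "FileId" } }
      ]
    }
  }
}
```

Note that the additional column carries the full file path rather than just the ID, so extracting the ID portion would happen downstream (for example in the database, depending on your naming scheme). With the subfolder split, wildcardFolderPath can be parameterized and one copy run launched per folder.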

Refer to this doc for more details: https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview