I'm looking for advice on best practice for process orchestration. To give some context, I have the following tasks to orchestrate:
- Scale up Azure Batch Pool to provide adequate nodes
- Execute custom .NET code which calls a server to retrieve a list of tasks. These tasks change on a daily basis. Queue these tasks onto the Batch Pool.
- Execute each task (custom .NET code) on the Batch Pool. Each task creates data within an Azure Storage account.
- Scale down the Batch Pool as it is no longer required.
- Start / scale up the Data Warehouse
- Bulk import the data into the Data Warehouse (I expect to use a combination of PolyBase and BCP).
- Aggregate the data and produce output to an Azure Storage account.
- Pause / scale down the Data Warehouse
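To make the shape of the process concrete, here is a rough Python sketch of the ordering I have in mind. It is not real Azure code: `scaled_resource`, `run_daily_process`, `fetch_tasks` and `run_task` are hypothetical placeholders, and the log strings just stand in for the real scale, import and aggregate calls. The point is only that each resource should be scaled back down even if a step in the middle fails.

```python
from contextlib import contextmanager


@contextmanager
def scaled_resource(name, log):
    """Hypothetical stand-in: scale a resource up, then always back down."""
    log.append(f"scale up {name}")
    try:
        yield
    finally:
        # Runs even if a step inside the block raises,
        # so the resource is never left running after a failure.
        log.append(f"scale down {name}")


def run_daily_process(fetch_tasks, run_task, log):
    """Run the steps in order; fetch_tasks and run_task are placeholders."""
    with scaled_resource("batch pool", log):
        for task in fetch_tasks():  # the task list changes daily
            run_task(task)          # each task writes data to storage
            log.append(f"ran {task}")
    with scaled_resource("data warehouse", log):
        log.append("bulk import")
        log.append("aggregate and export")
    return log
```

In this sketch a failing task skips the Data Warehouse stage entirely but still scales the pool down; whether a real tool should retry or resume from the failed step instead is part of what I'm asking.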
I'm currently comparing Data Factory to Runbooks to perform the above.
I find Runbooks very primitive in terms of visualisation at both design time and run time.
I find Data Factory much more visually appealing. However, the data slicing seems like massive overkill. I simply want the process to execute at, say, 8am each morning. I don't want it to attempt to execute for past days (for example, if I amend the template). I also expect Data Factory will handle failure/resume along the pipeline of activities better.
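As an illustration of the scheduling behaviour I'm after, here is a minimal Python sketch. The `next_run` function is hypothetical, and it assumes naive local datetimes with no time zone handling. It only ever looks forward to the next 8am and never produces runs for days that have already passed.

```python
from datetime import datetime, time, timedelta


def next_run(now, run_at=time(8, 0)):
    """Return the next scheduled run strictly after `now`; never backfills."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        # Today's slot has already passed (or is happening now): use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

For example, called at 07:00 it returns 08:00 the same day, and called at 09:30 it returns 08:00 the next day, regardless of how many earlier days were missed.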
Are there any other approaches I should consider here, or any recommendations?
Thanks, David