I am running a Custom activity in ADF v2 using the Batch service. Whenever it runs, it creates only one CloudTask within my Batch job, although my code has more than two dozen Parallel.Invoke calls. Is there a way to create multiple tasks from one Custom activity in ADF so that the processing is spread across all nodes in the Batch pool?

I have a fixed pool with two nodes. Max tasks per node is set to 8 and the scheduling policy is set to "Spread". My pipeline has only one Custom activity, with multiple Parallel.Invoke calls (almost two dozen). I was hoping this would create multiple CloudTasks spread across both of my nodes, since both nodes are single-core. It looks like each Custom activity run in ADF creates only one task (CloudTask) in the Batch service.

My other hope was to use

https://docs.microsoft.com/en-us/azure/batch/tutorial-parallel-dotnet

and manually create multiple CloudTasks programmatically in my console application, then run that console application from the ADF Custom activity. However, CloudTask only takes a task ID and a command line. I wanted to do something like the following, but instead of passing taskCommandLine, pass a C# method name and parameters to execute:

string taskId = "task" + i.ToString().PadLeft(3, '0');
string taskCommandLine = "ping -n " + rand.Next(minPings, maxPings + 1).ToString() + " localhost";
CloudTask task = new CloudTask(taskId, taskCommandLine);
// Wanted to do: CloudTask task = new CloudTask(taskId, SomeMethod(args));
tasks.Add(task);

Also, it looks like we can't create CloudTasks using the Batch .NET API from within an ADF Custom activity.

What do I want to achieve?

I have data in a SQL Server table and I want to run different transformations on it by slicing it horizontally or vertically (by picking rows or columns). I want to run those transformations in parallel: multiple CloudTask instances, each operating on a specific column independently and loading the result into a different table after the transformation. The issue is that it looks like we can't use the Batch .NET API within ADF, so the only way seems to be having multiple Custom activities in my Data Factory pipeline.

Is this doing the same single task in parallel, or multiple versions of the same task in parallel? - iamdave
I have a single Custom activity with multiple Parallel.Invoke calls inside my code, and I was hoping this would spread the execution across both nodes in my Batch pool. Can we use the Batch .NET API in ADF to create more than one CloudTask within one Custom activity? - InTheWorldOfCodingApplications
Hi, did you find a solution for this? I am dealing with a somewhat similar scenario. Any leads would be appreciated. - PREETI BANSAL

1 Answer

The application needs to be deployed to every node in the Batch pool (for example as an application package), and the CloudTasks need to be created with a command line that calls the application:

CloudTask task = new CloudTask(
    "MyTask",
    "cmd /c %AZ_BATCH_APP_PACKAGE_MyTask%\\myTask.exe -args -here");
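
Since a CloudTask accepts only a command line, one way to approximate "passing a method name and parameters" is to have myTask.exe treat its first argument as an operation name and dispatch to the matching C# method. The following is a minimal sketch of that idea, not a definitive implementation. The names Worker, TransformColumn, TransformRows and BuildCommandLine are hypothetical and only for illustration; the executable name and the AZ_BATCH_APP_PACKAGE_MyTask variable are taken from the snippet above.

```csharp
using System;
using System.Collections.Generic;

public static class Worker
{
    // Maps an operation name given on the command line to a C# method.
    // The operations listed here are hypothetical placeholders.
    private static readonly Dictionary<string, Func<string[], int>> Operations =
        new Dictionary<string, Func<string[], int>>(StringComparer.OrdinalIgnoreCase)
        {
            ["TransformColumn"] = TransformColumn,
            ["TransformRows"] = TransformRows,
        };

    // Entry logic for myTask.exe: args[0] is the operation name,
    // the remaining arguments are passed to the selected method.
    public static int Run(string[] args)
    {
        if (args.Length == 0 || !Operations.TryGetValue(args[0], out var operation))
        {
            Console.Error.WriteLine("Usage: myTask.exe <operation> [parameters]");
            return 1;
        }

        var parameters = new string[args.Length - 1];
        Array.Copy(args, 1, parameters, 0, parameters.Length);
        return operation(parameters);
    }

    // Placeholder: a real implementation would read and transform one column.
    private static int TransformColumn(string[] parameters)
    {
        Console.WriteLine("Transforming column(s): " + string.Join(", ", parameters));
        return 0;
    }

    // Placeholder: a real implementation would read and transform a row slice.
    private static int TransformRows(string[] parameters)
    {
        Console.WriteLine("Transforming rows: " + string.Join(", ", parameters));
        return 0;
    }

    // Builds the command line string that a CloudTask would be created with.
    public static string BuildCommandLine(string operation, params string[] parameters)
    {
        var parts = new List<string>
        {
            "cmd /c %AZ_BATCH_APP_PACKAGE_MyTask%\\myTask.exe",
            operation,
        };
        parts.AddRange(parameters);
        return string.Join(" ", parts);
    }
}

public static class Program
{
    public static int Main(string[] args)
    {
        return Worker.Run(args);
    }
}
```

With something like this in place, the code that submits work could build one command line per column, for example Worker.BuildCommandLine("TransformColumn", "Price"), pass each one to new CloudTask(taskId, commandLine), and add the tasks to the job, so that Batch can schedule them across the nodes in the pool.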