
The Azure Batch .NET tutorial shows deleting tasks, then jobs, then pools. However, I found an article about cleanup that, instead of deleting jobs and pools, simply deletes the VMs that are no longer used. Is this a good strategy? I can see that it helps with keeping track of tasks and with general organization, but what are the implications of keeping tasks, jobs and pools around after deleting the completed tasks' VMs?

Update 2018-05-15

The maximum number of tasks in a given job appears to be 7770.

I followed the answer below and eventually ended up with a job that had accumulated 7770 tasks. Past this magic number, the Batch service was no longer able to add new tasks to the job and threw the following exception:

System.AggregateException: One or more errors occurred. ---> Microsoft.Azure.Batch.Common.BatchException: InternalError: Server encountered an internal error. Please try again after some time.
RequestId:e6ab60e0-5c3b-4116-9ffb-ba2032154318
Time:2018-05-15T11:17:17.2186951Z
   at Microsoft.Azure.Batch.Protocol.CloudPoolOperations.<GetAsync>d__65.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.Batch.Protocol.BatchRequest`2.<ExecuteRequestWithCancellationAsync>d__c.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.Batch.Protocol.BatchRequest`2.<ExecuteRequestAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.Batch.ProtocolLayer.<ProcessAndExecuteBatchRequest>d__11b`2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.Batch.PoolOperations.<GetPoolAsync>d__3.MoveNext()
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at Microsoft.Azure.Batch.PoolOperations.GetPool(String poolId, DetailLevel detailLevel, IEnumerable`1 additionalBehaviors)
   at Pepe.Helpers.Batch.CreatePool()
   at Pepe.Helpers.LogEntryMaintainer.LaunchJob(LogFile log, PepeEntities db)

I suggest a regular cleanup of historical data. If you need to retain this information, I suggest moving it elsewhere, such as table storage.
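As a minimal sketch of that cleanup idea (in Python for brevity, even though the question is about .NET; the helper name, the dict fields, and the 30-day retention window are all my own assumptions, not part of the Batch SDK): pick out completed tasks older than a retention window, archive their metadata, then delete them from the job.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper: decide which completed tasks are old enough to be
# archived (e.g. to table storage) and then deleted from the job. The
# task records here are plain dicts; a real version would read these
# fields from the Batch SDK's task objects instead.
def select_stale_tasks(tasks, now, retention=timedelta(days=30)):
    stale = []
    for task in tasks:
        if task["state"] != "completed":
            continue  # never touch active or running tasks
        if now - task["completed_at"] >= retention:
            stale.append(task["id"])
    return stale
```

A periodic job could then write each selected task's metadata to table storage before calling the Batch delete-task API on it.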

The batch-file tag is for Windows .bat files. Not sure what this has to do with that. – Squashman
@Squashman it keeps changing “batch” to “batch-file”. Strange thing. – Nomenator

1 Answer


You don't need to delete tasks or their jobs at all, and there's no real downside to leaving them in the system. You do, however, need to terminate jobs on completion, as there is a quota on the number of active jobs within a Batch account.
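To illustrate why terminating matters, here is a hedged sketch (again in Python; the helper name and the record shape are illustrative, not SDK types): an active job whose tasks have all completed still counts against the active-job quota, so such jobs are safe candidates for termination.

```python
# Illustrative helper (not part of the Batch SDK): list active jobs with
# no remaining incomplete tasks. Terminating these frees active-job
# quota without losing any work.
def jobs_safe_to_terminate(jobs):
    return [
        job["id"]
        for job in jobs
        if job["state"] == "active" and job["incomplete_tasks"] == 0
    ]
```

Alternatively, the service can terminate a job for you automatically by setting the job's onAllTasksComplete property to terminate the job when all of its tasks finish.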

With respect to pools and VMs, it's really up to you how you want to manage resources (particularly since these cost you money). Common approaches include:

  • Explicitly create a pool to run one or more jobs. Delete the pool once finished. The pool can also be explicitly resized if you want to increase/decrease the number of VMs.
  • Use the Autopool feature to specify a pool to be created automatically with job submission. The pool will be automatically deleted when the job is terminated.
  • Explicitly create a pool and define an autoscale formula that can scale the pool up and down as jobs are submitted and complete. This is a very powerful feature for maximizing resource utilization and minimizing cost.
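As a sketch of the third option, an autoscale formula along these lines grows the pool with pending work and shrinks it to zero when idle (the 5-minute sampling window and the 10-node cap are arbitrary choices for illustration, not recommendations):

```
// Peak number of active tasks over the last 5 minutes
$tasks = max($ActiveTasks.GetSample(TimeInterval_Minute * 5));
// One dedicated node per task, capped at 10; scale to zero when idle
$TargetDedicatedNodes = min($tasks, 10);
// Let a node's running tasks finish before it is removed
$NodeDeallocationOption = taskcompletion;
```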

Hope that helps.