
I have a question about the algorithmic complexity of a non-preemptive job scheduling algorithm, used in particular to obtain the optimal solution to a 1||sum(Uj) problem (a single machine), where Uj is a unit penalty (that is: equal to 0 if job j finishes by its due date, and equal to 1 if it finishes late).

I need to minimize the sum of unit penalties (that is: the number of late jobs). The job processing times are given, together with their due dates, and there are no release times (so all the jobs are available at time t=0). The algorithm works like this:

1- sequence all the jobs according to Earliest Due Date

2- identify the first late job Jj. If there are no late jobs, go to step 4.

3- identify the job with the largest processing time among Jj and its predecessors. Remove that job and return to step 2, considering the new sequence without the removed job.

4- The optimal sequence is given by the current sequence, followed by the subset of jobs removed at step 3, in any order.
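For concreteness, here is a rough C++ sketch of how I would implement these steps literally (the struct and all the names are just mine, and the repeated re-scan is deliberately naive):

#include <algorithm>
#include <vector>

struct Job { int id; long long p; long long d; };   // processing time p, due date d

// Literal transcription of steps 1-4: each round re-scans the current
// sequence for the first late job and removes the job with the largest
// processing time among that job and its predecessors.
std::vector<Job> schedule_naive(std::vector<Job> jobs) {
    // step 1: sequence by Earliest Due Date
    std::sort(jobs.begin(), jobs.end(),
              [](const Job& a, const Job& b) { return a.d < b.d; });
    std::vector<Job> removed;
    bool removed_something = true;
    while (removed_something) {
        removed_something = false;
        long long finish = 0;
        for (std::size_t i = 0; i < jobs.size(); ++i) {
            finish += jobs[i].p;
            if (finish > jobs[i].d) {                 // step 2: first late job
                std::size_t k = 0;                    // step 3: largest among 0..i
                for (std::size_t m = 1; m <= i; ++m)
                    if (jobs[m].p > jobs[k].p) k = m;
                removed.push_back(jobs[k]);
                jobs.erase(jobs.begin() + k);
                removed_something = true;
                break;                                // back to step 2 on the new sequence
            }
        }
    }
    // step 4: current sequence followed by the removed jobs, in any order
    jobs.insert(jobs.end(), removed.begin(), removed.end());
    return jobs;
}

Each removal triggers an O(n) scan for the first late job plus an O(n) search for the largest processing time, and there can be up to n removals.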

I'm asked to compute the worst-case O(...) complexity of this algorithm, assuming a good general-purpose implementation (so, for the moment, I have to estimate the complexity "implementation-free", given only the algorithm steps).

I think that step 1 requires O(n log n), since (using QuickSort or MergeSort) it's just a matter of sorting.

But then, what about steps 2-3-4? Is it correct to state that the final complexity may be O(n^2) if we implement this, for example, in C/C++? Or is it O(n log n), thanks to EDD "optimizing" the subsequent search? What could a possible proof of its complexity look like?

Thanks a lot in advance.


1 Answer


It seems to me that this whole algorithm can be done in O(n log n) time. The basic idea is to amortize the cost of removing predecessors. Each time we remove a predecessor, we will pay O(log n), and there will be at most n removals.

Here are the specifics. After step 1, you only need to do a single pass over the jobs, since deleting jobs from the sequence can only make the remaining jobs finish sooner. Every time you consider a job in the sequence, you repeatedly remove its largest predecessor until it finishes on time. Notice that the total number of "remove the largest predecessor" operations is at most n, since each job is removed at most once. [Note: I believe some care is needed here because "predecessor" should be an inclusive notion. That is, when considering a particular job j, if j itself has the largest processing time among the jobs considered so far, then you should remove j and move on to the next job.]

How do you implement the "remove the largest predecessor" operation efficiently? You can do this by maintaining (as you walk through the jobs left-to-right) an ordering on the predecessors by processing time. This ordering can be implemented with, for example, a binary search tree, which permits insertions and deletions in O(log n). Every time you begin considering a job, you insert it into the tree, paying O(log n). Deleting the largest predecessor is then just deleting the largest job from the tree.
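For instance (this is just one concrete choice, not required by the argument), in C++ the tree could be a std::multiset keyed by processing time, since processing times may repeat; both operations then cost O(log n):

#include <iterator>
#include <set>

// The "tree" of predecessors, keyed by processing time (a multiset, so ties are allowed).
using PredTree = std::multiset<long long>;

long long remove_largest(PredTree& pred) {   // "remove the largest predecessor"
    long long p = *pred.rbegin();            // the largest processing time in the tree
    pred.erase(std::prev(pred.end()));       // O(log n) deletion
    return p;
}

Inserting a job is just pred.insert(p_j), again O(log n).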

Earlier we noticed that there will be at most n removals, corresponding to at most n deletions from the tree. Thus we pay a total cost of O(n log n) for deletions. We also pay a total cost of O(n log n) for insertions (exactly one insertion per job). The preprocessing step of sorting all the data is O(n log n), so the total cost of all steps is O(n log n).

Here's some pseudocode:

J = input sorted by earliest due date
T = empty tree        # on-time jobs seen so far, ordered by processing time
R = empty list        # removed jobs; they go at the end of the schedule (step 4)
for each j in J, from left to right:
  insert j into T
  while j does not finish on time:
    j' = delete largest from T
    append j' to R
    if j' == j:
      break to next job in J
output: jobs left in T, in due-date order, followed by R in any order

The only detail I've left out is how to check whether a job finishes on time. But that is pretty straightforward, since you can keep track of the "current finishing time" simply by adding to and subtracting from a counter as jobs are inserted into and deleted from the tree.
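Putting the pieces together, here is a self-contained C++ sketch of the whole procedure under the assumptions above (the std::multiset of (processing time, position) pairs stands in for the tree, finish is that counter, and all names are my own, not part of the pseudocode):

#include <algorithm>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

struct Job { int id; long long p; long long d; };   // processing time p, due date d

// One pass in EDD order; a multiset of (processing time, EDD position) pairs
// plays the role of the tree, and `finish` is the running "current finishing
// time" counter described above.
std::vector<int> schedule(std::vector<Job> jobs) {
    const std::size_t n = jobs.size();
    std::sort(jobs.begin(), jobs.end(),
              [](const Job& a, const Job& b) { return a.d < b.d; });   // sort by due date

    std::multiset<std::pair<long long, std::size_t>> tree;
    std::vector<char> removed(n, 0);
    long long finish = 0;

    for (std::size_t i = 0; i < n; ++i) {
        tree.insert({jobs[i].p, i});                 // insert j into T, O(log n)
        finish += jobs[i].p;
        while (finish > jobs[i].d) {                 // j does not finish on time
            auto it = std::prev(tree.end());         // largest predecessor (possibly j itself)
            finish -= it->first;                     // keep the counter in sync
            removed[it->second] = 1;
            bool was_j = (it->second == i);
            tree.erase(it);                          // delete largest from T, O(log n)
            if (was_j) break;                        // j itself was removed; next job
        }
    }

    std::vector<int> order;                          // on-time jobs in EDD order...
    for (std::size_t i = 0; i < n; ++i)
        if (!removed[i]) order.push_back(jobs[i].id);
    for (std::size_t i = 0; i < n; ++i)              // ...followed by the removed jobs
        if (removed[i]) order.push_back(jobs[i].id);
    return order;
}

Sorting, the n insertions, and the at most n deletions each cost O(n log n) in total, matching the analysis above.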