This is a variation of the popular El Goog problem.
Consider the following scheduling problem: there are n jobs, i = 1..n. There is one super computer and an unlimited number of PCs. Each job must first be pre-processed on the super computer and then processed on a PC. Job i takes time si on the super computer and time pi on a PC, i = 1..n. The PCs can work in parallel, but the super computer can only work on one job at a time. A schedule S is an order in which the super computer processes the jobs. The completion time Ti(S) of job i in schedule S is the clock time at which the job finishes on its PC. We want a schedule that minimizes max_i Ti(S) (to be read as: we need to find a schedule that minimizes the highest finish time). The following greedy algorithm is proposed: arrange the jobs in decreasing order of their processing times on the PCs. The complexity of this algorithm is O(n log n). Either prove that this algorithm produces an optimal solution or provide a counter-example to show that it does not.
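For concreteness, here is a minimal Python sketch of how I'm reading the problem: the super computer runs the jobs one at a time in schedule order, and each job moves to its own PC (all PCs in parallel) the moment its pre-processing ends. The helper names (makespan, greedy_order) are my own, not part of the problem statement.

def makespan(jobs):
    """Finish time of the last job, given jobs as (s_i, p_i) pairs in schedule order."""
    sc_clock = 0   # time at which the super computer becomes free
    latest = 0     # latest PC completion time T_i(S) seen so far
    for s, p in jobs:
        sc_clock += s                     # job i waits for all earlier pre-processing
        latest = max(latest, sc_clock + p)
    return latest

def greedy_order(jobs):
    """The proposed greedy rule: sort by PC time p_i, largest first (O(n log n))."""
    return sorted(jobs, key=lambda job: job[1], reverse=True)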
My solution (not sure if this is correct): I don't think it matters how we order the jobs; the highest finish time will be the same either way. Consider this example of processing times on the PCs for a list of jobs: <5, 7, 17, 8, 10>. Treating the finish times as running totals of the pi's, this yields <5, 12, 29, 37, 47>. As per the algorithm, we sort the list as <17, 10, 8, 7, 5>, which yields finish times of <17, 27, 35, 42, 47>. So while technically the greedy algorithm does give an optimal ordering, it takes O(n log n) time to do so, while simply iterating through the list of jobs gives the same result.
If anyone thinks that the greedy algorithm will work better or that my approach is flawed, I'd appreciate your thoughts. Thanks!
Update: I think I may have the answer. The time taken by the super computer doesn't decide the ordering, since the total pre-processing work is the same in any order; the key is that the PCs run in parallel. From the initial example of pi = <5, 7, 17, 8, 10>, let's add si = <8, 5, 1, 12, 9>. In the default, unsorted order, the completion times are <13, 20, 31 (= 8 + 5 + 1 + 17), 34, 45>, so the overall completion time is 45. Now take the order in which the pi's are decreasing [sorted input: pi = <17, 10, 8, 7, 5>, si = <1, 9, 12, 5, 8>]. The completion times become <18, 20, 30, 34, 40>, so sorting by decreasing pi brings the finish time down from 45 to 40.
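Plugging the update's numbers into the makespan sketch above reproduces both values (again, this is just a sanity check with my own hypothetical helpers, not part of the original question):

jobs = list(zip([8, 5, 1, 12, 9], [5, 7, 17, 8, 10]))   # (s_i, p_i) pairs
print(makespan(jobs))                 # 45 in the given (unsorted) order
print(makespan(greedy_order(jobs)))   # 40 after sorting by decreasing p_i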
Here's an example that probably clears the whole thing up: si = <17, 10>, pi = <10, 17>. The output in the unsorted case (which here also happens to be sorted in descending order of si) is <27, 44>. Sorting by descending pi, the input becomes si = <10, 17>, pi = <17, 10>, and the output is <27, 37>. Since the PCs run in parallel, you want to send the job with the shortest PC time last (equivalently, the longest PC times first).
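The same check works for this two-job example (using the same hypothetical helpers sketched earlier):

jobs = [(17, 10), (10, 17)]           # (s_i, p_i): given order, descending s_i
print(makespan(jobs))                 # 44
print(makespan(greedy_order(jobs)))   # 37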