2
votes

I am trying to understand how Greedy Algorithm scheduling problem works.

So I've been reading and googling for a while since I could not understand Greedy algorithm scheduling problem.

We have n jobs to schedule on a single resource. The job (i) has a requested start time s(i) and finish time f(i).

There are some greedy ideas which we select...

  • Accept in increasing order of s ("earliest start time")
  • Accept in increasing order of f - s ("shortest job time")
  • Accept in increasing order of number of conflicts ("fewest conflicts")
  • Accept in increasing order of f ("earliest finish time")

And the book says the last one, accept in increasing order of f will always gives an optimal solution.

However it did not mention why it always gives optimal solution and why other 3 will not give optimal solution.

They provided the figure that says why other three will not provide optimal solution but I could not understand what it means.

Since I have low reputation, I can not post any image so I will try to draw it.

 |---| |---| |---|
|-------------------------|
increasing order of s underestimated solution

|-----------| |-----------|
   |-----|
increasing order of f-s underestimated solution

|----|  |----| |----|  |----|

 |-----| |-----| |-----|

 |-----|    |-----|

 |-----|    |-----|

increasing order of number of conflicts. underestimated solution

This is what it looks like and I don't see why this is a counterexample of each scenario.

If anyone can explain why each greedy idea does/ does not work, it will be very helpful.

Thank you.

2
Please specify more what are your conditions. Will you know all the staring and finishing times in advance, or will new task come after some time? Are you allowed to to move your task to be processed at another time? If so, will you receive some penalty for doing so? How do you evaluate what is a good result? Is only a fully performed task worth doing? Is doing long task more important than doing shorter ones?kajacx

2 Answers

5
votes

I think I can explain this.
Lets say, we have n jobs, start times as s[1..n] and finish times as f[1..n]. So if we sort it according to finish times, then, we will always be able to complete most number of tasks. Lets see, how.

If a job is finishing earlier (even if it started later in the series, a short job), then, we always have more time for later jobs. Lets assume, we have other jobs that we could start/complete in this interval so that our number of tasks could increase. Now, this is not actually possible as if any task completed before this, then that would be the one with earliest finish time so we would be working on that one. And, if any task has not been completed till now (but has started), then if we selected that, we would not have completed any task but now we actually have done one at least. So, in any case, this is the most optimal choice.
There are many possible solutions with maximum number of tasks that can be done in an interval, EFT gives one such solution. But it is always the max number possible.

I hope I could explain it well.

4
votes

Since @vish4071 has already explained why selecting earliest finish time will lead to optimal solution, I'll only explain the counterexamples. Task [a,b] starts at a and ends at b. I'll use the counterexamples you have provided.

  1. Earliest start time

Suppose tasks [1,10], [2,3], [4,5], [6,7]. The earliest start time strategy will choose [1,10] and then refuse the other 3, since they all collide with the first one. Yet we can see that [2,3], [4,5], [6,7] is the optimal solution, so earliest start time strategy will not always yield the optimal result.

  1. Shortest execution time

Suppose tasks [1,10], [11,20], [9,12]. This strategy would choose [9,12] and then reject the other two, but optimal solution is [1,10], [11,20]. Therefore, shortest execution time strategy will not always lead to optimal result.

  1. Least amount of collisions

This strategy seems promising, but your example with 11 task proves it not to be optimal. Suppose tasks: [1,4], 3x[3,6], [5,8], [7,10], [9,12], 3x[11,14] and [13, 16]. [7,10] has only 2 collisions with other tasks, which is less than any other task, so it would be selected first by the least amount of collisions strategy. Then [1,4] and [13, 16] would be selected, and all the other tasks rejected because they collide with already selected tasks. That is 3 tasks, however 4 tasks can be selected without collision: [1,4], [5,8], [9,12] and [13, 16].

You can also see that the earliest finish time strategy will always choose the optimal solution in these examples. Note that more than one optimal solution can exist with same number of selected tasks. In such case, earliest finish time strategy will always choose one of them.