4
votes

I know the algorithm exists, but I am having problems naming it and finding a suitable solution.

My problem is as follows:

  • I have a set of J jobs that need to be completed.

  • All jobs take different times to complete, but the time is known.

  • I have a set of R resources.

  • Each resource R may have any number of instances, from 1 to 100.

  • A Job may need to use any number of resources R.

  • A job may need to use multiple instances of a resource R, but never more than that resource has instances. (If a resource has only 2 instances, a job will never need more than 2.)

  • Once a job completes it returns all instances of all resources it used back into the pool for other jobs to use.

  • A job cannot be preempted once started.

  • As long as resources allow, there is no limit to the number of jobs that can simultaneously execute.

  • This is not a directed graph problem; the jobs J may execute in any order as long as they can claim their resources.

My goal: find the schedule for the jobs that minimizes total run time and/or maximizes resource utilization.
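For concreteness, the problem data above can be captured in a small data model, plus the basic feasibility test ("can this job start right now?"). All names here (`Job`, `fits`, the example capacities) are illustrative, not part of the question:

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    time: int    # known duration
    use: dict    # resource name -> number of instances needed

# Illustrative instance: resource capacities and a few jobs.
capacity = {"cpu": 4, "license": 2}

jobs = [
    Job("a", time=3, use={"cpu": 2}),
    Job("b", time=2, use={"cpu": 2, "license": 1}),
    Job("c", time=4, use={"license": 2}),
]

def fits(job, available):
    """A job can start only if every resource it needs is available
    in sufficient quantity (per the constraints above)."""
    return all(available.get(r, 0) >= n for r, n in job.use.items())

print(fits(jobs[0], capacity))  # job a needs 2 cpu; 4 are free -> True
```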

2
You would have to define what you mean by the "best" way to complete all jobs, but this would still be off-topic for Stack Overflow. – chepner
google.com/… a hybrid metaheuristic for the resource-constrained project scheduling problem – Louis Ricci
Sounds like the resource optimization problem of the ICON challenge. Metaheuristics worked well on it. The open source implementation you see in that video won 2nd prize. – Geoffrey De Smet
Just got back to work for the day, making progress on the solution that you posted harold, will hopefully have more in a few hours. – Kaiser
@Kaiser, could you please share your equations for this problem? – alanextar

2 Answers

3
votes

I'm not sure how good this idea is, but you could model this as an integer linear program (ILP), as follows (not tested).

Define some constants,

Use[j,i] = amount of resource i used by job j
Time[j] = length of job j
Capacity[i] = amount of resource i available

Define some variables,

x[j,t] = 1 if job j starts at time t, else 0 (binary)
r[i,t] = amount of resource of type i used at time t
slot[t] = 1 if time slot t is used, else 0 (binary)

The constraints are,

// every job must start exactly once
(1). for every j, sum[t](x[j,t]) = 1
// a resource can only be used up to its capacity
(2). for every i, t: r[i,t] <= Capacity[i]
// if a job is running, it uses resources
(3). for every i, t: r[i,t] = sum[j, s | s <= t && s + Time[j] > t] (x[j,s] * Use[j,i])
// if a job is running, then the time slot is used
(4). for every j, s, t with s <= t && s + Time[j] > t: slot[t] >= x[j,s]

The third constraint means that if a job was started recently enough that it's still running, then its resource usage is added to the currently used resources. The fourth constraint means that if a job was started recently enough that it's still running, then this time slot is used.
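Constraint (3) can be sanity-checked outside any solver: given candidate start times, accumulate each job's usage over the slots it occupies (a job started at s occupies slots s through s + Time[j] - 1) and compare against the capacities from constraint (2). The function names here are illustrative:

```python
def resource_usage(start, Time, Use, horizon):
    """r[i][t] = total amount of resource i used at slot t,
    summed over all jobs running at t (constraint (3))."""
    resources = {i for use in Use.values() for i in use}
    r = {i: [0] * horizon for i in resources}
    for j, s in start.items():
        for t in range(s, s + Time[j]):        # slots s .. s+Time[j]-1
            for i, amount in Use[j].items():
                r[i][t] += amount
    return r

def feasible(start, Time, Use, Capacity, horizon):
    """Constraint (2): usage never exceeds capacity at any slot."""
    r = resource_usage(start, Time, Use, horizon)
    return all(r[i][t] <= Capacity[i] for i in r for t in range(horizon))

Time = {"a": 3, "b": 2}
Use = {"a": {"cpu": 2}, "b": {"cpu": 2}}
Capacity = {"cpu": 3}
print(feasible({"a": 0, "b": 0}, Time, Use, Capacity, 5))  # overlap: 4 > 3 -> False
print(feasible({"a": 0, "b": 3}, Time, Use, Capacity, 5))  # b waits for a -> True
```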

The objective function is the weighted sum of slots, with higher weights for later slots, so that it prefers to fill the early slots. In theory the weights must increase exponentially to ensure using a later time slot is always worse than any configuration that uses only earlier time slots, but solvers don't like that and in practice you can probably get away with using slower growing weights.

You will need enough slots that a solution exists, but preferably not many more than you end up needing, so I suggest you start with a greedy solution to get a hopefully non-trivial upper bound on the number of time slots (the sum of the lengths of all tasks is always a trivial upper bound).

There are many ways to get a greedy solution; for example, schedule the jobs one by one, each into the earliest time slot where it fits. It may work better to order them by some measure of "hardness" and place the hard ones first. For example, you could score each job by how badly it uses a resource up (say, the sum of Use[j,i] / Capacity[i], or maybe the maximum? who knows, try some things) and then schedule in decreasing order of that score.
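The greedy idea above might be sketched like this: sort by a "hardness" score, then drop each job into the earliest window where all of its resource demands fit. This is one possible reading, not a tested reference implementation:

```python
def greedy_schedule(Time, Use, Capacity):
    """Place each job at the earliest start where, together with jobs
    already placed, no resource exceeds its capacity."""
    def hardness(j):
        # One candidate score: how badly job j eats into each resource.
        return sum(Use[j].get(i, 0) / Capacity[i] for i in Capacity)

    horizon = sum(Time.values())          # trivial upper bound on makespan
    free = {i: [Capacity[i]] * horizon for i in Capacity}
    start = {}
    for j in sorted(Time, key=hardness, reverse=True):  # hardest first
        for s in range(horizon - Time[j] + 1):
            if all(free[i][t] >= amount
                   for i, amount in Use[j].items()
                   for t in range(s, s + Time[j])):
                start[j] = s
                for i, amount in Use[j].items():
                    for t in range(s, s + Time[j]):
                        free[i][t] -= amount
                break
    return start

Time = {"a": 3, "b": 2, "c": 2}
Use = {"a": {"cpu": 2}, "b": {"cpu": 2}, "c": {"cpu": 1}}
Capacity = {"cpu": 3}
print(greedy_schedule(Time, Use, Capacity))
```

With a horizon of sum(Time.values()), every job is guaranteed to fit somewhere (worst case, all jobs run one after another).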

As a bonus, you may not always have to solve the full ILP problem (which is NP-hard, so sometimes it can take a while): if you solve just the linear relaxation (allowing the variables to take fractional values, not just 0 or 1) you get a lower bound, and the approximate greedy solutions give upper bounds. If they are sufficiently close, you can skip the costly integer phase and take a greedy solution. In some cases this can even prove the greedy solution optimal: when the rounded-up objective from the linear relaxation equals the objective of the greedy solution.
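Computing the LP relaxation requires an LP solver. As a much cheaper stand-in lower bound in the same spirit (this is a simple "area" bound of my own, not the LP relaxation itself): no schedule can beat the total work on any single resource divided by that resource's capacity, nor the longest single job.

```python
import math

def area_lower_bound(Time, Use, Capacity):
    """Lower bound on the makespan: for each resource, the total "area"
    of work (instances x duration) divided by capacity, rounded up;
    also at least as long as the longest single job."""
    area = max(
        math.ceil(sum(Use[j].get(i, 0) * Time[j] for j in Time) / Capacity[i])
        for i in Capacity)
    return max(area, max(Time.values()))

print(area_lower_bound({"a": 3, "b": 2, "c": 2},
                       {"a": {"cpu": 2}, "b": {"cpu": 2}, "c": {"cpu": 1}},
                       {"cpu": 3}))  # cpu-area 12 over capacity 3 -> 4
```

If a greedy schedule's makespan matches such a bound, that schedule is provably optimal and the ILP is unnecessary.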

1
votes

This might be a job for Dijkstra's algorithm. For your case, if you want to maximize resource utilization, each node in the search space is the result of adding a job to the list of jobs you'll run at once, and each edge is weighted by the resources left after adding that job.

The goal, then, is to find the path to the node whose incoming edge has the smallest value.

An alternative, which is more straightforward, is to view this as a knapsack problem.

To construct this problem as an instance of The Knapsack Problem, I'd do the following:

Assuming I have J jobs j_1, j_2, ..., j_n and resources R, I want to find the subset of J such that, when that subset is scheduled, the leftover of R is minimized (I'll call that subset J').

in pseudo-code:

def knapsack(J, R, J'):
  potential_solutions = []
  for j in J:
    if R >= resources_used_by(j):
      potential_solutions.push( knapsack(J - j, R - resources_used_by(j), J' + j) )
  if potential_solutions is empty:  # no remaining job fits
    return J', R
  return best_solution_of(potential_solutions)
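The sketch above can be made runnable by pinning down its hypothetical pieces: here I assume a single scalar resource, with each job's usage given in a dict (standing in for `resources_used_by`), and "best" meaning the batch leaving the least capacity unused:

```python
def best_batch(jobs, capacity):
    """Exhaustively search subsets of jobs that fit within a single
    scalar resource capacity, returning the batch that leaves the
    least capacity unused (i.e. maximizes utilization)."""
    best = (set(), capacity)          # (chosen jobs, leftover capacity)

    def search(remaining, leftover, chosen):
        nonlocal best
        if leftover < best[1]:
            best = (set(chosen), leftover)
        for j in remaining:
            need = jobs[j]
            if need <= leftover:      # job j still fits: recurse without it
                search(remaining - {j}, leftover - need, chosen | {j})

    search(frozenset(jobs), capacity, set())
    return best

jobs = {"a": 4, "b": 3, "c": 2}       # resource units each job needs
print(best_batch(jobs, 6))            # a and c fill all 6 units
```

Like the pseudo-code, this is exponential in the number of jobs, so it is only practical for small batches; for larger instances the ILP or metaheuristic approaches mentioned above scale better.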